CHIMERIC PROTEINS COMPRISING BORRELIA POLYPEPTIDES;
USES THEREFOR

Background of the Invention

Lyme borreliosis is the most common tick-borne
infectious disease in North America, Europe, and
northern Asia. The causative bacterial agent of this
disease, Borrelia burgdorferi, was first isolated and
cultivated in 1982 (Burgdorfer, W. et al., Science
216:1317-1319 (1982); Steere, A.C. et al., N. Engl. J.
Med. 308:733-740 (1983)). With that discovery, a wide
array of clinical syndromes, described in both the
European and American literature since the early 20th
century, could be attributed to infection by B.
burgdorferi (Afzelius, A., Acta Derm. Venereol.
2:120-125 (1921); Bannwarth, A., Arch. Psychiatr.
Nervenkrankh. 117:161-185 (1944); Garin, C. and A.
Bujadoux, J. Med. Lyon 71:765-767 (1922); Herxheimer,
K. and K. Hartmann, Arch. Dermatol. Syphilol. 61:57-76,
255-300 (1902)).

The immune response to B. burgdorferi is
characterized by an early, prominent, and persistent
humoral response to the endoflagellar protein, p41
(fla), and to a protein constituent of the protoplasmic
cylinder, p93 (Szczepanski, A., and J.L. Benach,
Microbiol. Rev. 55:21 (1991)). The p41 flagellin
antigen is an immunodominant protein; however, it shares
significant homology with flagellins of other
microorganisms and therefore is highly cross-reactive.
The p93 antigen is the largest immunodominant antigen of
B. burgdorferi. Both the p41 and p93 proteins are
physically cryptic antigens, sheathed from the immune
system by an outer membrane whose major protein
constituents are the outer surface proteins A and B
(OspA and OspB). OspA is a basic lipoprotein of
approximately 31 kd, which is encoded on a large linear
plasmid along with OspB, a basic lipoprotein of
approximately 34 kd (Szczepanski, A., and J.L. Benach,
Microbiol. Rev. 55:21 (1991)). Analysis of isolates of
B. burgdorferi obtained from North America and Europe
has demonstrated that OspA has antigenic variability,
and that several distinct groups can be serologically
and genotypically defined (Wilske, B., et al., World J.
Microbiol. 7:130 (1991)). Other Borrelia proteins
demonstrate similar antigenic variability.
Surprisingly, the immune response to these outer surface
proteins tends to occur late in the disease, if at all
(Craft, J.E. et al., J. Clin. Invest. 78:934-939
(1986); Dattwyler, R.J. and B.J. Luft, Rheum. Clin.
North Am. 15:727-734 (1989)). Furthermore, patients
acutely and chronically infected with B. burgdorferi
respond variably to the different antigens, including
OspA, OspB, OspC, OspD, p39, p41 and p93.

Vaccines against Lyme borreliosis have been
attempted. Mice immunized with a recombinant form of
OspA are protected from challenge with the same strain
of B. burgdorferi from which the protein was obtained
(Fikrig, E., et al., Science 250:553-556 (1990)).
Furthermore, passively transferred anti-OspA monoclonal
antibodies (MAbs) have been shown to be protective in
mice, and vaccination with a recombinant protein induced
protective immunity against subsequent infection with
the homologous strain of B. burgdorferi (Simon, M.M., et
al., J. Infect. Dis. 164:123 (1991)). Unfortunately,
immunization with a protein from one strain does not
necessarily confer resistance to a heterologous strain
(Fikrig, E. et al., J. Immunol. 148(7):2256-2260 (1992)),
but rather, is limited to the homologous 'species' from
which the protein was prepared. Furthermore,
immunization with a single protein from a particular
strain of Borrelia will not confer resistance to that
strain in all individuals. There is considerable
variation displayed in OspA and OspB, as well as p93,
including the regions conferring antigenicity.
Therefore, the degree and frequency of protection from
vaccination with a protein from a single strain depend
upon the response of the immune system to the particular
variation, as well as the frequency of genetic variation
in B. burgdorferi. Currently, a need exists for a
vaccine which provides immunogenicity across species and
to more epitopes within a species, as well as
immunogenicity against more than one protein.

Summary of the Invention
The current invention pertains to chimeric Borrelia
proteins which include two or more antigenic Borrelia
polypeptides which do not occur naturally (in nature) in
the same protein in Borrelia, as well as the nucleic
acids encoding such chimeric proteins. The antigenic
polypeptides incorporated in the chimeric proteins are
derived from any Borrelia protein from any strain of
Borrelia, and include outer surface protein (Osp) A,
OspB, OspC, OspD, p12, p39, p41, p66, and p93. The
proteins from which the antigenic polypeptides are
derived can be from the same strain of Borrelia, from
different strains, or from combinations of proteins from
the same and from different strains. If the proteins
from which the antigenic polypeptides are derived are
OspA or OspB, the antigenic polypeptides can be derived
from either the portion of the OspA or OspB protein
present between the amino terminus and the conserved
tryptophan of the protein (referred to as a proximal
portion), or the portion of the OspA or OspB protein
present between the conserved tryptophan of the protein
and the carboxy terminus (referred to as a distal
portion). Particular chimeric proteins, and the
nucleotide sequences encoding them, are set forth in
Figures 23-37 and 43-46.

The chimeric proteins of the current invention
provide antigenic polypeptides of a variety of Borrelia
strains and/or proteins within a single protein. Such
proteins are particularly useful in immunodiagnostic
assays to detect the presence of antibodies to native
Borrelia in potentially infected individuals as well as
to measure T-cell reactivity, and can therefore be used
as immunodiagnostic reagents. The chimeric proteins of
the current invention are additionally useful as vaccine
immunogens against Borrelia infection.

For a better understanding of the present invention
together with other and further objects, reference is
made to the following description, taken together with
the accompanying drawings.

Brief Description of the Drawings
Figure 1 summarizes peptides and antigenic domains
localized by proteolytic and chemical fragmentation of
OspA.

Figure 2 is a comparison of the antigenic domains
depicted in Figure 1, for OspA in nine strains of B.
burgdorferi.

Figure 3 is a graph depicting a plot of weighted
polymorphism versus amino acid position among 14 OspA
variants. The marked peaks are: a) amino acids 132-145;
b) amino acids 163-177; c) amino acids 208-221. The
lower dotted line at polymorphism value 1.395 demarcates
statistically significant excesses of polymorphism at p
= 0.05. The upper dotted line at 1.520 is the same,
except that the first 29 amino acids at the monomorphic
N-terminus have been removed from the original analysis.

Figure 4 depicts the amino acid alignment of
residues 200 through 220 for OspAs from strains B31 and
K48 as well as for the site-directed mutants 613, 625,
640, 613/625, and 613/640. Arrow indicates Trp216.
Amino acid changes are underlined.

Figure 5 is a helical wheel projection of residues
204-217 of B31 OspA. Capital letters indicate
hydrophobic residues; lower case letters indicate
hydrophilic residues; +/- indicate positively/negatively
charged residues. Dashed line indicates division of the
alpha-helix into hydrophobic arc (above the line) and
polar arc (below the line). Adapted from France et al.
(Biochem. Biophys. Acta 1120:59 (1992)).

Figure 6 depicts a phylogenic tree for strains of
Borrelia described in Table I. The strains are as
follows: 1 = B31; 2 = PKal; 3 = ZS7; 4 = N40; 5 =
25015; 6 = K48; 7 = DK29; 8 = PHei; 9 = Ip90; 10 =
PTrob; 11 = ACAI; 12 = PGau; 13 = Ip3; 14 = PBo; 15 =
PKo.

Figure 7 depicts the nucleic acid sequence of
OspA-B31 (SEQ ID NO. 6), and the encoded protein
sequence (SEQ ID NO. 7).

Figure 8 depicts the nucleic acid sequence of
OspA-K48 (SEQ ID NO. 8), and the encoded protein
sequence (SEQ ID NO. 9).

Figure 9 depicts the nucleic acid sequence of
OspA-PGau (SEQ ID NO. 10), and the encoded protein
sequence (SEQ ID NO. 11).

Figure 10 depicts the nucleic acid sequence of
OspA-25015 (SEQ ID NO. 12), and the encoded protein
sequence (SEQ ID NO. 13).

Figure 11 depicts the nucleic acid sequence of
OspB-B31 (SEQ ID NO. 21), and the encoded protein
sequence (SEQ ID NO. 22).

Figure 12 depicts the nucleic acid sequence of
OspC-B31 (SEQ ID NO. 29), and the encoded protein
sequence (SEQ ID NO. 30).

Figure 13 depicts the nucleic acid sequence of
OspC-K48 (SEQ ID NO. 31), and the encoded protein
sequence (SEQ ID NO. 32).

Figure 14 depicts the nucleic acid sequence of
OspC-PKo (SEQ ID NO. 33), and the encoded protein
sequence (SEQ ID NO. 34).

Figure 15 depicts the nucleic acid sequence of
OspC-pTrob (SEQ ID NO. 35) and the encoded protein
sequence (SEQ ID NO. 36).

Figure 16 depicts the nucleic acid sequence of
p93-B31 (SEQ ID NO. 65) and the encoded protein
sequence (SEQ ID NO. 66).

Figure 17 depicts the nucleic acid sequence of
p93-K48 (SEQ ID NO. 67).

Figure 18 depicts the nucleic acid sequence of
p93-PBo (SEQ ID NO. 69).

Figure 19 depicts the nucleic acid sequence of
p93-pTrob (SEQ ID NO. 71).

Figure 20 depicts the nucleic acid sequence of
p93-pGau (SEQ ID NO. 73).

Figure 21 depicts the nucleic acid sequence of
p93-25015 (SEQ ID NO. 75).

Figure 22 depicts the nucleic acid sequence of
p93-pKo (SEQ ID NO. 77).

Figure 23 depicts the nucleic acid sequence of the
OspA-K48/OspA-PGau chimer (SEQ ID NO. 85) and the
encoded chimeric protein sequence (SEQ ID NO. 86).

Figure 24 depicts the nucleic acid sequence of the
OspA-B31/OspA-PGau chimer (SEQ ID NO. 88) and the
encoded chimeric protein sequence (SEQ ID NO. 89).

Figure 25 depicts the nucleic acid sequence of the
OspA-B31/OspA-K48 chimer (SEQ ID NO. 91) and the encoded
chimeric protein sequence (SEQ ID NO. 92).

Figure 26 depicts the nucleic acid sequence of the
OspA-B31/OspA-25015 chimer (SEQ ID NO. 94) and the
encoded chimeric protein sequence (SEQ ID NO. 95).

Figure 27 depicts the nucleic acid sequence of the
OspA-K48/OspA-B31/OspA-K48 chimer (SEQ ID NO. 97) and
the encoded chimeric protein sequence (SEQ ID NO. 98).

Figure 28 depicts the nucleic acid sequence of the
OspA-B31/OspA-K48/OspA-B31/OspA-K48 chimer (SEQ ID NO.
100) and the encoded chimeric protein sequence (SEQ ID
NO. 101).

Figure 29 depicts the nucleic acid sequence of the
OspA-B31/OspB-B31 chimer (SEQ ID NO. 103) and the
encoded chimeric protein sequence (SEQ ID NO. 104).

Figure 30 depicts the nucleic acid sequence of the
OspA-B31/OspB-B31/OspC-B31 chimer (SEQ ID NO. 106) and
the encoded chimeric protein sequence (SEQ ID NO. 107).

Figure 31 depicts the nucleic acid sequence of the
OspC-B31/OspA-B31/OspB-B31 chimer (SEQ ID NO. 109) and
the encoded chimeric protein sequence (SEQ ID NO. 110).

Figure 32 depicts the nucleic acid sequence of the
OspA-B31/p93-B31 chimer (SEQ ID NO. 111) and the encoded
chimeric protein sequence (SEQ ID NO. 112).

Figure 33 depicts the nucleic acid sequence of the
OspB-B31/p41-B31 (122-234) chimer (SEQ ID NO. 113) and
the encoded chimeric protein sequence (SEQ ID NO. 114).

Figure 34 depicts the nucleic acid sequence of the
OspB-B31/p41-B31 (122-295) chimer (SEQ ID NO. 115) and
the encoded chimeric protein sequence (SEQ ID NO. 116).

Figure 35 depicts the nucleic acid sequence of the
OspB-B31/p41-B31 (140-234) chimer (SEQ ID NO. 117) and
the encoded chimeric protein sequence (SEQ ID NO. 118).

Figure 36 depicts the nucleic acid sequence of the
OspB-B31/p41-B31 (140-295) chimer (SEQ ID NO. 119) and
the encoded chimeric protein sequence (SEQ ID NO. 120).

Figure 37 depicts the nucleic acid sequence of the
OspB-B31/p41-B31 (122-234)/OspC-B31 chimer (SEQ ID NO.
121) and the encoded chimeric protein sequence (SEQ ID
NO. 122).

Figure 38 depicts an alignment of the nucleic acid
sequences for OspC-B31 (SEQ ID NO. 29), OspC-PKo (SEQ ID
NO. 33), OspC-pTrob (SEQ ID NO. 35), and OspC-K48 (SEQ
ID NO. 31). Nucleic acids which are identical to those
in the lead nucleic acid sequence (here, OspC-B31) are
represented by a period (.); differing nucleic acids are
shown in lower case letters.

Figure 39 depicts an alignment of the nucleic acid
sequences for OspD-pBo (SEQ ID NO. 123), OspD-PGau (SEQ
ID NO. 124), OspD-DK29 (SEQ ID NO. 125), and OspD-K48
(SEQ ID NO. 126). Nucleic acids which are identical to
those in the lead nucleic acid sequence (here, OspD-pBo)
are represented by a period (.); differing nucleic acids
are shown in lower case letters.

Figure 40 depicts the nucleic acid sequence of
p41-B31 (SEQ ID NO. 127) and the encoded protein
sequence (SEQ ID NO. 128).

Figure 41 depicts an alignment of the nucleic acid
sequences for p41-B31 (SEQ ID NO. 127), p41-pKal (SEQ ID
NO. 129), p41-PGau (SEQ ID NO. 51), p41-PBo (SEQ ID NO.
130), p41-DK29 (SEQ ID NO. 53), and p41-PKo (SEQ ID NO.
131). Nucleic acids which are identical to those in the
lead nucleic acid sequence (here, p41-B31) are
represented by a period (.); differing nucleic acids are
shown in lower case letters.

Figure 42 depicts an alignment of the nucleic acid
sequences for OspA-B31 (SEQ ID NO. 6), OspA-pKal (SEQ ID
NO. 132), OspA-N40 (SEQ ID NO. 133), OspA-ZS7 (SEQ ID
NO. 134), OspA-25015 (SEQ ID NO. 12), OspA-pTrob (SEQ ID
NO. 135), OspA-K48 (SEQ ID NO. 8), OspA-Hei (SEQ ID NO.
136), OspA-DK29 (SEQ ID NO. 49), OspA-Ip90 (SEQ ID NO.
50), OspA-pBo (SEQ ID NO. 55), OspA-Ip3 (SEQ ID NO. 56),
OspA-PKo (SEQ ID NO. 57), OspA-ACAI (SEQ ID NO. 58), and
OspA-PGau (SEQ ID NO. 10). Nucleic acids which are
identical to those in the lead nucleic acid sequence
(here, OspA-B31) are represented by a period (.);
differing nucleic acids are shown in lower case letters.

Figure 43 depicts the nucleic acid sequence of the
OspA-Tro/OspA-Bo chimer (SEQ ID NO. 137) and the encoded
chimeric protein sequence (SEQ ID NO. 138).

Figure 44 depicts the nucleic acid sequence of the
OspA-PGau/OspA-Bo chimer (SEQ ID NO. 139) and the
encoded chimeric protein sequence (SEQ ID NO. 140).

Figure 45 depicts the nucleic acid sequence of the
OspA-B31/OspA-PGau/OspA-B31/OspA-K48 chimer (SEQ ID NO.
141) and the encoded chimeric protein sequence (SEQ ID
NO. 142).

Figure 46 depicts the nucleic acid sequence of the
OspA-PGau/OspA-B31/OspA-K48 chimer (SEQ ID NO. 143) and
the encoded chimeric protein sequence (SEQ ID NO. 144).
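
The alignment figures (Figures 38, 39, 41 and 42) use a lead-sequence
convention: positions identical to the lead sequence are shown as a
period, and differing positions are shown in lower case. As an
illustration only, the following minimal Python sketch renders one
aligned sequence against a lead sequence in that way. The function name
and the toy sequences are hypothetical and are not taken from the
figures; the sketch assumes the two sequences are already aligned and of
equal length.

```python
def render_against_lead(lead: str, other: str) -> str:
    """Show `other` relative to `lead`: '.' where the two sequences
    agree, the differing nucleotide in lower case where they do not."""
    if len(lead) != len(other):
        raise ValueError("aligned sequences must have equal length")
    return "".join(
        "." if a.upper() == b.upper() else b.lower()
        for a, b in zip(lead, other)
    )


# Toy sequences for illustration only (not from the sequence listing).
lead = "ATGAAAAAATATTTA"
other = "ATGAAGAAATACTTA"
print(lead)
print(render_against_lead(lead, other))
```

Reading such a display, each visible lower-case letter marks a
substitution relative to the lead strain, so long runs of periods
correspond to conserved stretches.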



Detailed Description of the Invention

The current invention pertains to chimeric proteins
comprising antigenic Borrelia polypeptides which do not
occur in nature in the same Borrelia protein. The
chimeric proteins are a combination of two or more
antigenic polypeptides derived from Borrelia proteins.
The antigenic polypeptides can be derived from different
proteins from the same species of Borrelia, or different
proteins from different Borrelia species, as well as
from corresponding proteins from different species. As
used herein, the term "chimeric protein" describes a
protein comprising two or more polypeptides which are
derived from corresponding and/or non-corresponding
native Borrelia proteins. A polypeptide "derived from" a
native Borrelia protein is a polypeptide which has an
amino acid sequence the same as an amino acid sequence
present in a Borrelia protein, an amino acid sequence
equivalent to the amino acid sequence of a naturally
occurring Borrelia protein, or an amino acid sequence
substantially similar to the amino acid sequence of a
naturally occurring Borrelia protein (e.g., differing by
few amino acids) such as when a nucleic acid encoding a
protein is subjected to site-directed mutagenesis.
"Corresponding" proteins are equivalent proteins from
different species or strains of Borrelia, such as outer
surface protein A (OspA) from strain B31 and OspA from
strain K48. The invention additionally pertains to
nucleic acids encoding these chimeric proteins.

As described below, Applicants have identified two
separate antigenic domains of OspA and OspB which flank
the sole conserved tryptophan present in OspA and in
OspB. These domains share cross-reactivity with
different genospecies of Borrelia. The precise amino
acids responsible for antigenic variability were
determined through site-directed mutagenesis, so that
proteins with specific amino acid substitutions are
available for the development of chimeric proteins.
Furthermore, Applicants have identified immunologically
important hypervariable domains in OspA proteins, as
described below in Example 2. The first hypervariable
domain of interest for chimeric proteins, Domain A,
includes amino acid residues 120-140 of OspA; the second
hypervariable domain, Domain B, includes residues
150-180; and the third hypervariable domain, Domain C,
includes residues 200-216 or 217 (depending on the
position of the sole conserved tryptophan residue in the
OspA of that particular species of Borrelia) (see Figure
3). In addition, Applicants have sequenced the genes
for several Borrelia proteins.
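
For orientation, the residue ranges given above for Domains A, B and C
can be written as a small lookup. The following Python sketch is
illustrative only: the function names are hypothetical, positions are
treated as 1-based and inclusive, and the upper bound of Domain C is
assumed to be the position of the conserved tryptophan (216 or 217),
which is one reading of the ranges stated above.

```python
def hypervariable_domains(trp_position: int = 216) -> dict:
    """Return the OspA hypervariable domain ranges (1-based, inclusive).

    Domain C is assumed here to extend to the conserved tryptophan,
    which the text places at residue 216 or 217 depending on species.
    """
    if trp_position not in (216, 217):
        raise ValueError("conserved tryptophan expected at 216 or 217")
    return {"A": (120, 140), "B": (150, 180), "C": (200, trp_position)}


def domain_of(residue: int, trp_position: int = 216):
    """Name the hypervariable domain containing `residue`, or None."""
    for name, (start, end) in hypervariable_domains(trp_position).items():
        if start <= residue <= end:
            return name
    return None


for position in (132, 163, 208, 145):
    print(position, domain_of(position))
```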

These discoveries have aided in the development of
novel recombinant Borrelia proteins which include two or
more amino acid regions or sequences which do not occur
in the same Borrelia protein in nature. The recombinant
proteins comprise polypeptides from a variety of
Borrelia proteins, including, but not limited to, OspA,
OspB, OspC, OspD, p12, p39, p41, p66, and p93.
Antigenically relevant polypeptides from each of a
number of proteins are combined into a single chimeric
protein.

In one embodiment of the current invention, chimers
are now available which include antigenic polypeptides
flanking a tryptophan residue. The antigenic
polypeptides are derived from either the proximal
portion from the tryptophan (the portion of the OspA or
OspB protein present between the amino terminus and the
conserved tryptophan of the protein), or the distal
portion from the tryptophan (the portion of the OspA or
OspB protein present between the conserved tryptophan of
the protein and the carboxy terminus) in OspA and/or
OspB. The resultant chimers can be OspA-OspA chimers
(i.e., chimers incorporating polypeptides derived from
OspA from different strains of Borrelia), OspA-OspB
chimers, or OspB-OspB chimers, and are constructed such
that amino acid residues amino-proximal to an invariant
tryptophan are from one protein and residues
carboxy-proximal to the invariant tryptophan are from
the other protein. For example, one available chimer
consists of a polypeptide derived from the
amino-proximal region of OspA from strain B31, followed
by the tryptophan residue, followed by a polypeptide
derived from the carboxy-proximal region of OspA from
strain K48 (SEQ ID NO. 92). Another available chimer
includes a polypeptide derived from the amino-proximal
region of OspA from strain B31, and a polypeptide
derived from the carboxy-proximal region of OspB from
strain B31 (SEQ ID NO. 104). If the polypeptide proximal
to the tryptophan of these chimeric proteins is derived
from OspA, the proximal polypeptide can be further
subdivided into the three hypervariable domains (Domains
A, B, and C), each of which can be derived from OspA
from a different strain of Borrelia. These chimeric
proteins can further comprise antigenic polypeptides
from another protein, in addition to the antigenic
polypeptides flanking the tryptophan residue.
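
The arrangement just described (amino-proximal residues from one
protein, the invariant tryptophan, then carboxy-proximal residues from
the other protein) can be sketched at the level of amino acid strings.
The Python below is a minimal illustration, not the construction method
of the Examples: the function names and the short toy sequences are
hypothetical, and the sketch assumes each parent sequence contains
exactly one tryptophan (W), mirroring the "sole conserved tryptophan" of
OspA and OspB.

```python
def split_at_sole_tryptophan(seq: str):
    """Split a protein sequence into the parts before and after its
    single tryptophan (W); the tryptophan itself is excluded."""
    if seq.count("W") != 1:
        raise ValueError("expected exactly one tryptophan in the sequence")
    i = seq.index("W")
    return seq[:i], seq[i + 1:]


def make_chimer(amino_donor: str, carboxy_donor: str) -> str:
    """Join the amino-proximal part of one parent, the tryptophan, and
    the carboxy-proximal part of the other parent."""
    amino_part, _ = split_at_sole_tryptophan(amino_donor)
    _, carboxy_part = split_at_sole_tryptophan(carboxy_donor)
    return amino_part + "W" + carboxy_part


# Toy sequences for illustration only (not real OspA sequences).
parent_1 = "MKKYLLGAWTS"
parent_2 = "MKRYVLGWNEQ"
print(make_chimer(parent_1, parent_2))
```

In practice the joining is done at the nucleic acid level, so a sketch
like this is useful mainly for checking which parent contributes which
residues in a planned chimer.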

In another embodiment of the current invention,
chimeric proteins are available which incorporate
antigenic domains of two or more Borrelia proteins, such
as Osp proteins (Osp A, B, C and/or D) as well as p12,
p39, p41, p66, and/or p93.

The chimers described herein can be produced so
that they are highly soluble, hyper-produced in E. coli,
and non-lipidated. In addition, the chimeric proteins
can be designed to end in an affinity tag (His-tag) to
facilitate purification. The recombinant proteins
described herein have been constructed to maintain high
levels of antigenicity. In addition, recombinant
proteins specific for the various genospecies of
Borrelia that cause Lyme disease are now available,
because the genes from each of the major genospecies
have been sequenced; the sequences are set forth below.
These recombinant proteins with their novel biophysical
and antigenic properties will be important diagnostic
reagents and vaccine candidates.

The chimeric proteins of the current invention are
advantageous in that they retain specific reactivity to
monoclonal and polyclonal antibodies against wild-type
Borrelia proteins, are immunogenic, and inhibit the
growth or induce lysis of Borrelia in vitro.
Furthermore, in some embodiments, the proteins provide
antigenic domains of two or more Borrelia strains and/or
proteins within a single protein. Such proteins are
particularly useful in immunodiagnostic assays. For
example, proteins of the present invention can be used
as reagents in assays to detect the presence of
antibodies to native Borrelia in potentially infected
individuals. These proteins can also be used as
immunodiagnostic reagents, such as in dot blots, Western
blots, enzyme-linked immunosorbent assays, or
agglutination assays. The chimeric proteins of the
present invention can be produced by known techniques,
such as by recombinant methodology, polymerase chain
reaction, or mutagenesis.

Furthermore, the proteins of the current invention
are useful as vaccine immunogens against Borrelia
infection. Because Borrelia has been shown to be
clonal, a protein comprising antigenic polypeptides from
a variety of Borrelia proteins and/or species will
provide immunoprotection for a considerable time when
used in a vaccine. The lack of significant intragenic
recombination, a process which might rapidly generate
novel epitopes with changed antigenic properties,
ensures that Borrelia can only change antigenic type by
accumulating mutational change, which is slow when
compared with recombination in generating different
antigenic types. The chimeric protein can be combined
with a physiologically acceptable carrier and
administered to a vertebrate animal through standard
methods (e.g., intravenously or intramuscularly).



The current invention is illustrated by the
following Examples, which are not to be construed to be
limiting in any way.

Example 1.

Purification of Borrelia burgdorferi Outer
Surface Protein A and Analysis of
Antibody Binding Domains

This example details a method for the purification
of large amounts of native outer surface protein A
(OspA) to homogeneity, and describes mapping of the
antigenic specificities of several anti-OspA MAbs. OspA
was purified to homogeneity by exploiting its resistance
to trypsin digestion. Intrinsic labeling with 14C-
palmitic acid confirmed that OspA was lipidated, and
partial digestion established lipidation at the
amino-terminal cysteine of the molecule.

The reactivity of seven anti-OspA murine monoclonal
antibodies to nine different Borrelia isolates was
ascertained by Western blot analysis. Purified OspA was
fragmented by enzymatic or chemical cleavage, and the
monoclonal antibodies were able to define four distinct
immunogenic domains (see Figure 1). Domain 3, which
included residues 190-220 of OspA, was reactive with
protective antibodies known to agglutinate the organism
in vitro, and included distinct specificities, some of
which were not restricted to a genotype of B.
burgdorferi.

A. Purification of Native OspA

Detergent solubilization of B. burgdorferi strips
the outer surface proteins and yields partially-purified
preparations containing both OspA and outer surface
protein B (OspB) (Barbour, A.G. et al., Infect. Immun.
52(5):549-554 (1986); Coleman, J.L. and J.L. Benach, J.
Infect. Dis. 155(4):756-765 (1987); Cunningham, T.M.
et al., Ann. NY Acad. Sci. 539:376-378 (1988); Brandt,
M.E. et al., Infect. Immun. 58:983-991 (1990); Sambri,
V. and R. Cevenini, Microbiol. 14:307-314 (1991)).
Although both OspA and OspB are sensitive to proteinase
K digestion, in contrast to OspB, OspA is resistant to
cleavage by trypsin (Dunn, J. et al., Prot. Exp. Purif.
1:159-168 (1990); Barbour, A.G. et al., Infect. Immun.
45:94-100 (1984)). The relative insensitivity to
trypsin is surprising in view of the fact that OspA has
a high (16% for B31) lysine content, and may relate to
the relative configuration of OspA and OspB in the outer
membrane.

Intrinsic Radiolabeling of Borrelia

Labeling for lipoproteins was performed as
described by Brandt et al. (Infect. Immun. 58:983-991
(1990)). 14C-palmitic acid (ICN, Irvine, California) was
added to the BSK II media to a final concentration of
0.5 µCi per milliliter (ml). Organisms were cultured at
34°C in this medium until a density of 10^8 cells per ml
was achieved.

Purification of OspA Protein from Borrelia Strain B31
Borrelia burgdorferi, either 14C-palmitic acid-
labeled or unlabeled, were harvested and washed as
described (Brandt, M.E. et al., Infect. Immun.
58:983-991 (1990)). Whole organisms were trypsinized
according to the protocol of Barbour et al. (Infect.
Immun. 45:94-100 (1984)) with some modifications. The
pellet was suspended in phosphate buffered saline (PBS,
10 mM, pH 7.2), containing 0.8% tosyl-L-phenylalanine
chloromethyl ketone (TPCK)-treated trypsin (Sigma, St.
Louis, Missouri), the latter at a ratio of 1 µg per 10^8
cells. Reaction was carried out at 25°C for 1 hour,
following which the cells were centrifuged. The pellet
was washed in PBS with 100 µg/ml phenylmethylsulfonyl
fluoride (PMSF). Triton X-114 partitioning of the pellet
was carried out as described by Brandt et al. (Infect.
Immun. 58:983-991 (1990)). Following trypsin treatment,
cells were resuspended in ice-cold 2% (v/v) Triton X-114
in PBS at 10^9 cells per ml. The suspension was rotated
overnight at 4°C, and the insoluble fraction removed as
a pellet after centrifugation at 10,000 x g for 15
minutes at 4°C. The supernatant (soluble fraction) was
incubated at 37°C for 15 minutes and centrifuged at room
temperature at 1000 x g for 15 minutes to separate the
aqueous and detergent phases. The aqueous phase was
decanted, and ice-cold PBS added to the lower Triton
phase, mixed, warmed to 37°C, and again centrifuged at
1000 x g for 15 minutes. Washing was repeated twice
more. Finally, detergent was removed from the
preparation using a spin column of Bio-beads SM2
(BioRad, Melville, New York) as described (Holloway,
P.W., Anal. Biochem. 53:304-308 (1973)).
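
The per-cell ratios in the procedure above scale directly with the
number of organisms harvested. As a worked illustration only, the
Python sketch below computes the trypsin mass (1 µg per 10^8 cells) and
the Triton X-114 resuspension volume (10^9 cells per ml) for a given
cell count. The function names and the example cell count of 10^11 are
hypothetical and are not taken from the source.

```python
TRYPSIN_UG_PER_CELL = 1.0 / 1e8      # 1 microgram of trypsin per 10^8 cells
CELLS_PER_ML_TRITON = 1e9            # resuspend at 10^9 cells per ml


def trypsin_mass_ug(total_cells: float) -> float:
    """Micrograms of TPCK-treated trypsin for `total_cells` organisms."""
    return total_cells * TRYPSIN_UG_PER_CELL


def triton_volume_ml(total_cells: float) -> float:
    """Millilitres of 2% Triton X-114 in PBS for `total_cells` organisms."""
    return total_cells / CELLS_PER_ML_TRITON


# Hypothetical harvest of 10^11 cells, chosen only to make the
# arithmetic easy to follow.
cells = 1e11
print(f"trypsin: {trypsin_mass_ug(cells):.0f} ug")
print(f"Triton X-114 volume: {triton_volume_ml(cells):.0f} ml")
```

For this hypothetical harvest the ratios work out to 1000 µg (1 mg) of
trypsin and 100 ml of detergent solution.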

Ion exchange chromatography was carried out as
described by Dunn et al. (Prot. Exp. Purif. 1:159-168
(1990)) with minor modifications. Crude OspA was
dissolved in buffer A (1% Triton X-100, 10 mM phosphate
buffer (pH 5.0)) and loaded onto a SP Sepharose resin
(Pharmacia, Piscataway, New Jersey), pre-equilibrated
with buffer A at 25°C. After washing the column with 10
bed-volumes of buffer A, the bound OspA was eluted with
buffer B (1% Triton X-100, 10 mM phosphate buffer (pH
8.0)). OspA fractions were detected by protein assay
using the BCA method (Pierce, Rockford, Illinois), or as
radioactivity when intrinsically labeled material was
fractionated. Triton X-100 was removed using a spin
column of Bio-beads SM2.

This method purifies OspA from an outer surface
membrane preparation. In the absence of trypsin
treatment, OspA and OspB were the major components of
the soluble fraction obtained after Triton partitioning
of strain B31. In contrast, when Triton extraction was
carried out after trypsin treatment, the OspB band was
not seen. Further purification of OspA-B31 on a SP
Sepharose column resulted in a single band by SDS-PAGE.
The yield following removal of detergent was
approximately 2 mg per liter of culture. This method of
purification of OspA, as described herein for strain
B31, can be used for other isolates of Borrelia as well.
For strains such as strain K48, which lack OspB, trypsin
treatment can be omitted.

Lipidation site of OspA-B31

14C-palmitic acid-labeled OspA from strain B31 was
purified as described above and partially digested with
endoproteinase Asp-N (data not shown). Following
digestion, a new band of lower molecular weight was
apparent by SDS-PAGE, found by direct amino-terminal
sequencing to begin at Asp M . This band had no trace of
radioactivity by autoradiography (data not shown). OspA
and OspB contain a signal sequence (L-X-Y-C) similar to
the consensus described for lipoproteins of E. coli, and
it has been predicted that the lipidation site of OspA
and OspB should be the amino-terminal cysteine (Brandt,
M.E. et al., Infect. Immun. 58:983-991 (1990)). The
results presented herein support this prediction.

B. Comparison of OspA Antibody Binding Regions in Nine
Strains of Borrelia burgdorferi

The availability of the amino acid sequences for
OspA from a number of different isolates, combined with
peptide mapping and Western blot analysis, permitted the
identification of the antigenic domains recognized by
monoclonal antibodies (MAbs) and allowed inference of
the key amino acid residues responsible for specific
antibody reactivity.

Strains of Borrelia burgdorferi

Nine strains of Borrelia, including seven European
strains and two North American strains, were used in
this study of antibody binding domains of several
proteins. Information concerning the strains is
summarized in Table I, below.

Table I. Representative Borrelia Strains

Strain   Location and Source                  Reference for Strain

K48      Czechoslovakia, Ixodes ricinus       none
PGau     Germany, human ACA                   Wilske, B. et al., J. Clin. Microbiol. 32:340-350 (1993)
DK29     Denmark, human EM                    Wilske, B. et al.
PKo      Germany, human EM
PTrob    Germany, human skin                  Wilske, B. et al.
Ip3      Khabarovsk, Russia, I. persulcatus   Asbrink, E. et al., Acta Derm. Venereol. 64:506-512 (1984)
Ip90     Khabarovsk, Russia, I. persulcatus   Asbrink, E. et al.
25015    Millbrook, NY, I. persulcatus        Barbour, A.G. et al., Curr. Microbiol. 8:123-126 (1983)
B31      Shelter Island, NY, I. scapularis    Luft, B.J. et al., Infect. Immun. 60:4309-4321 (1992); ATCC 35210
PKal     Germany, human CSF                   Wilske, B. et al.
ZS7      Freiburg, Germany, I. ricinus        Wallich, R. et al., Nucl. Acids Res. 17:8864 (1989)
N40      Westchester Co., NY                  Fikrig, E. et al., Science 250:553-556 (1990)
PHei     Germany, human CSF                   Wilske, B. et al.
ACAI     Sweden, human ACA                    Luft, B.J. et al., FEMS Microbiol. Lett. 93:73-68 (1992)
PBo      Germany, human CSF                   Wilske, B. et al.

ACA = patient with acrodermatitis chronica atrophicans;
EM = patient with erythema migrans; CSF = cerebrospinal
fluid of patient with Lyme disease

Strains K48, PGau and DK29 were supplied by R.
Johnson, University of Minnesota; PKo and PTrob were
provided by B. Wilske and V. Preac-Mursic of the
Pettenkofer Institute, Munich, Germany; and Ip3 and Ip90
were supplied by L. Mayer of the Center for Disease
Control, Atlanta, Georgia. The North American strains
included strain 25015, provided by J. Anderson of the
Connecticut Department of Agriculture; and strain B31 (ATCC
35210).

Monoclonal Antibodies 

Seven monoclonal antibodies (MAbs) were utilized in
this study. Five of the MAbs (12, 13, 15, 83 and 336) were
produced from hybridomas cloned and subcloned as previously
described (Schubach, W.H. et al., Infect. Immun.
59(6):1911-1915 (1991)). MAb H5332 (Barbour, A.G. et al.,
Infect. Immun. 41:795-804 (1983)) was a gift from Dr. Alan
Barbour, University of Texas, and MAb CIII.78 (Sears, J.E.
et al., J. Immunol. 147(6):1995-2000 (1991)) was a gift
from Richard A. Flavell, Yale University. MAbs 12 and 15
were raised against whole sonicated B31; MAb 336 was
produced against whole PGau; and MAbs 13 and 83 were raised
to a truncated form of OspA cloned from the K48 strain and
expressed in E. coli using the T7 RNA polymerase system
(McGrath, B.C. et al., Vaccines, Cold Spring Harbor
Laboratory Press, Plainview, New York, pp. 365-370 (1993)).
All MAbs were typed as being Immunoglobulin G (IgG).

Methods of Protein Cleavage, Western Blotting, and
Amino-Terminal Sequencing

Prediction of the various cleavage sites was achieved
by knowledge of the primary amino acid sequence derived
from the full nucleotide sequences of OspA, many of which
are currently available (see Table II, below). Cleavage
sites can also be predicted based on the peptide sequence
of OspA, which can be determined by standard techniques
after isolation and purification of OspA by the method
described above. Cleavage of several OspA isolates was
conducted to determine the localization of monoclonal
antibody binding of the proteins.

Hydroxylamine-HCl (HA), N-chlorosuccinimide (NCS), and
cyanogen bromide cleavage of OspA followed the methods
described by Bornstein (Biochem. 9(12):2408-2421 (1970)),
Shechter et al. (Biochem. 15(23):5071-5075 (1976)), and
Gross (in Hirs, C.H.W. (ed.): Methods in Enzymology (N.Y.
Acad. Press), 11:238-255 (1967)), respectively. Protease
cleavage by endoproteinase Asp-N (Boehringer Mannheim,
Indianapolis, Indiana) was performed as described by
Cleveland, D.W. et al. (J. Biol. Chem. 252:1102-1106
(1977)). Ten micrograms of OspA were used for each
reaction. The ratio of enzyme to OspA was approximately 1
to 10 (w/w).
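
As a rough illustration of how cleavage sites can be predicted from a primary sequence, the following Python sketch applies the commonly cited specificities of the reagents named above: hydroxylamine at Asn-Gly bonds, NCS on the carboxy side of Trp, CNBr on the carboxy side of Met, and Asp-N on the amino side of Asp. The sequence `toy` is hypothetical and is not an OspA sequence; the names `RULES`, `cut_points` and `fragments` are illustrative only, and complete digestion is assumed.

```python
import re

# Each rule is a regex whose match END marks the cut: the 1-based number
# of the last residue of the amino-terminal fragment.
RULES = {
    "HA": r"N(?=G)",     # hydroxylamine: between Asn and Gly
    "NCS": r"W",         # N-chlorosuccinimide: carboxy side of Trp
    "CNBr": r"M",        # cyanogen bromide: carboxy side of Met
    "Asp-N": r"(?=D)",   # endoproteinase Asp-N: amino side of Asp
}

def cut_points(seq, reagent):
    """Return sorted cut positions (residue counts) inside seq."""
    points = set()
    for match in re.finditer(RULES[reagent], seq):
        cut = match.end()
        if 0 < cut < len(seq):  # ignore cuts at either terminus
            points.add(cut)
    return sorted(points)

def fragments(seq, reagent):
    """Return (first, last) residue numbers, 1-based inclusive, per fragment."""
    bounds = [0] + cut_points(seq, reagent) + [len(seq)]
    return [(a + 1, b) for a, b in zip(bounds, bounds[1:])]

toy = "MKKCAQDNGSLWTEMK"  # hypothetical 16-residue peptide
for reagent in RULES:
    print(reagent, fragments(toy, reagent))
```

Applied to a real OspA sequence, the same residue ranges could be compared with the amino-terminal sequencing results reported for each fragment.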

Proteins and peptides generated by cleavage were
separated by SDS-polyacrylamide gel electrophoresis
(SDS-PAGE) (Laemmli, U.K., Nature (London) 227:680-685 (1970)),
and electroblotted onto Immobilon polyvinylidene difluoride
(PVDF) membranes (Pluskal, M.G. et al., Biotechniques
4:272-283 (1986)). They were detected by amido black
staining or by immunostaining with murine MAbs, followed by
alkaline phosphatase-conjugated goat anti-mouse IgG.
Specific binding was detected using a
5-bromo-4-chloro-3-indolylphosphate (BCIP)/nitroblue tetrazolium (NBT)
developer system (KPL Inc., Gaithersburg, Maryland).

In addition, amino-terminal amino acid sequence
analysis was carried out on several cleavage products, as
described by Luft et al. (Infect. Immun. 57:3637-3645
(1989)). Amido black stained bands were excised from PVDF
blots and sequenced by Edman degradation using a Biosystems
model 475A sequenator with model 120A PTH analyzer and
model 900A control/data analyzer.

Cleavage Products of Outer Surface Protein A Isolates

Purified OspA-B31, labeled with 14C-palmitic acid, was
fragmented with hydroxylamine-HCl (HA) into two peptides,
designated HA1 and HA2 (data not shown). The HA1 band
migrated at 27 KD and retained its radioactivity,
indicating that the peptide included the lipidation site at
the N-terminus of the molecule (data not shown). From the
predicted cleavage point, HA1 should correspond to residues
1 to 251 of OspA-B31. HA2 had a MW of 21.6 KD by SDS-PAGE,
with amino-terminal sequence analysis showing it to begin
at Gly72, i.e. residues 72 to 273 of OspA-B31. By
contrast, HA cleaved OspA-K48 into three peptides,
designated HA1, HA2, and HA3, with apparent MWs of 22 KD, 16
KD and 12 KD, respectively. Amino-terminal sequencing
showed HA1 to start at Gly72, and HA3 at Gly142. HA2 was
found to have a blocked amino-terminus, as was observed for
the full-length OspA protein. HA1, 2 and 3 of OspA-K48
were predicted to be residues 72-274, 1 to 141 and 142 to
274, respectively.
N-Chlorosuccinimide (NCS) cleaves at tryptophan (W),
which is at residue 216 of OspA-B31 or residue 217 of
OspA-K48 (data not shown). NCS cleaved OspA-B31 into 2
fragments: NCS1, with a MW of 23 KD, residues 1-216 of the
protein, and NCS2, with a MW of 6.2 KD, residues 217 to 273
(data not shown). Similarly, K48 OspA was divided into 2
pieces: NCS1, residues 1-217, and NCS2, residues 218 to 274
(data not shown).
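
The reported fragment sizes can be checked for plausibility against the residue ranges. The sketch below is only an approximation: it assumes an average residue mass of about 110 Da (a rule of thumb, not a figure from this document) and ignores the lipid moiety and any anomalous migration on SDS-PAGE. The name `approx_kd` is hypothetical.

```python
AVG_RESIDUE_DA = 110.0  # assumed average mass per amino acid residue

def approx_kd(first, last):
    """Approximate mass in kD of a peptide spanning residues first..last."""
    n_residues = last - first + 1
    return n_residues * AVG_RESIDUE_DA / 1000.0

# Residue ranges of the NCS fragments of OspA-B31 as given in the text.
ncs_fragments = {"NCS1": (1, 216), "NCS2": (217, 273)}
for name, (first, last) in ncs_fragments.items():
    size = last - first + 1
    print(f"{name}: {size} residues, roughly {approx_kd(first, last):.1f} kD")
```

Under this assumption the estimates fall near 23.8 kD and 6.3 kD, in line with the 23 KD and 6.2 KD given for NCS1 and NCS2.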

Cleavage of OspA by cyanogen bromide (CNBr) occurs at
the carboxy side of methionine, residue 39. The major
fragment, CNBr1, has a MW of 25.7 KD, residues 39-274 by
amino-terminal amino acid sequence analysis (data not
shown). CNBr2 (about 4 KD) could not be visualized by
amido black staining; instead, lightly stained bands of
about 20 KD MW were seen. These bands reacted with
anti-OspA MAbs, and most likely were degradation products
due to cleavage by formic acid.

Determination of Antibody Binding Domains for Anti-OspA
Monoclonal Antibodies

The cleavage products of OspA-B31 and OspA-K48 were
analyzed by Western blot to assess their ability to bind to
the six different MAbs. Preliminary Western blot analysis
of the cleavage products demonstrated that strains K48 and
DK29 have similar patterns of reactivity, as do Ip3, PGau
and PKo. The OspA of strain PTrob was immunologically
distinct from the others, being recognized only by MAb 336.
MAb 12 recognized only the two North American strains, B31
and 25015. When the isolates were separated into
genogroups, it was remarkable that all the MAbs, except MAb
12, crossed over to react with multiple genogroups.

MAb 12, specific for OspA-B31, bound to both HA1 and
HA2 of OspA-B31. However, cleavage of OspA-B31 by NCS at
residue Trp216 created fragments which did not react with
MAb 12, suggesting that the relevant domain is near or is
structurally dependent upon the integrity of this residue
(data not shown). MAb 13 bound only to OspA-K48, and to
peptides containing the amino-terminus of that molecule
(e.g. HA2; NCS1). It did not bind to CNBr1, residues 39 to
274. Thus the domain recognized by MAb 13 is in the
amino-terminal end of OspA-K48, near Met38.

MAb 15 reacts with the OspA of both the B31 and K48
strains, and with peptides containing the N-terminus of OspA,
such as HA1 of OspA-B31 and NCS1, but not with peptides HA2
of OspA-B31 and HA1 of OspA-K48 (data not shown). Both
of these peptides include residue 72 to the C-terminus of the
molecules. MAb 15 bound to CNBr1 of OspA-K48, indicating
the domain for this antibody to be residues 39 to 72,
specifically near Gly72 (data not shown).

MAb 83 binds to OspA-K48, and to peptides containing
the C-terminal portion of the molecule, such as HA1. It
does not bind to HA2 of OspA-K48, most likely because the
C-terminus of HA2 of OspA-K48 ends at 141. Similar to MAb 12
and OspA-B31, binding of MAbs 83 and CIII.78 is eliminated
by cleavage of OspA at the tryptophan residue. Thus
binding of MAbs 12, 83 and CIII.78 to OspA depends on the
structural integrity of the Trp216 residue, which appears
to be critical for antigenicity. Also apparent is that,
although these MAbs bind to a common antigenic domain, the
precise epitopes which they recognize are distinct from one
another given the varying degrees of cross-reactivity to
these MAbs among strains.

Although there is similar loss of binding activity of
MAb 336 with cleavage at Trp216, this MAb does not bind to
HA1 of OspA-B31, suggesting the domain for this antibody
includes the carboxy-terminal end of the molecule,
inclusive of residues 251 to 273. Low MW peptides, such as
HA3 (10 KD) and NCS2 (6 KD), of OspA-K48 do not bind this
MAb on Western blots. In order to confirm this
observation, we tested binding of the 6 MAbs with a
recombinant fusion construct p3A/EC that contains a trpE
leader protein fused with residues 217 to 273 of OspA-B31
(Schubach, W.H. et al., Infect. Immun. 59(6):1911-1915
(1991)). Only MAb 336 reacted with this construct (data not
shown). Peptides and antigenic domains localized by
fragmentation of OspA are summarized in Figure 1.
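
The localization reasoning used in this section can be sketched as interval arithmetic: a linear epitope should lie within every fragment that binds the antibody and outside every fragment that does not. The Python sketch below is a simplification under that assumption; it does not model conformational effects such as the dependence on Trp216, and the function name `localize` is hypothetical. Fragment ranges are the OspA-K48 ranges given in the text.

```python
def localize(binding, non_binding):
    """Residues shared by all bound fragments and absent from all unbound ones.

    Fragments are (first, last) residue ranges, 1-based inclusive. The result
    is reported as one (first, last) span, which assumes the surviving
    residues are contiguous; None means no residue satisfies the constraints.
    """
    candidate = set.intersection(*(set(range(a, b + 1)) for a, b in binding))
    for a, b in non_binding:
        candidate -= set(range(a, b + 1))
    return (min(candidate), max(candidate)) if candidate else None

# MAb 13: binds HA2 (1-141) and NCS1 (1-217), not CNBr1 (39-274).
print("MAb 13:", localize([(1, 141), (1, 217)], [(39, 274)]))

# MAb 15: binds CNBr1 (39-274), not HA1 (72-274).
print("MAb 15:", localize([(39, 274)], [(72, 274)]))
```

The spans obtained, residues 1 to 38 and 39 to 71, bracket the regions near Met38 and Gly72 identified above for MAbs 13 and 15.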

Mapping of Domains to Define the Molecular Basis for
the Serotype Analysis

To define the molecular basis for the serotype
analysis of OspA, we compared the derived amino acid
sequences of OspA for the nine isolates (Figure 2). At the
amino terminus of the protein, these predictions can be
more precise given the relatively small number of amino
acid substitutions in this region compared to the carboxy
terminus. Domain 1, which is recognized by MAb 13, includes
residues Leu34 to Leu41. MAb 13 only binds to the OspA of
strains K48, DK29 and Ip90. Within this region, residue 37
is variable; however, Gly37 is conserved amongst the three
reactive strains. When Gly37 is changed to Glu37, as it is
in OspA of strains B31, PTrob, PGau, and PKo, MAb 13 does
not recognize the protein (data not shown). By similar
analysis, it can be seen that Asp70 is a crucial residue
for Domain 2, which includes residues 65 to 75 and is
recognized by MAb 15. Domain 3 is reactive with MAbs H5332,
12 and 83, and includes residues 190-220. It is clear that
significant heterogeneity exists between MAbs reactive with
this domain, and that more than one conformational epitope
must be contained within the sequence. Domain 4 binds
MAb 336, and includes residues 250 to 270. In this region,
residue 266 is variable and therefore may be an important
determinant. It is apparent, however, that other
determinants of the reactivity of this monoclonal antibody
reside in the region comprising amino acids 217-250.
Furthermore, the structural integrity of Trp216 is
essential for antibody reactivity in the intact protein.
Finally, it is important to stress that Figure 2 indicates
only the locations of the domains, and does not necessarily
encompass the entire domain. Exact epitopes are being
analyzed by site-directed mutagenesis of specific residues.

Overall, evidence suggests that the N-terminal portion
is not the immunodominant domain of OspA, possibly by
virtue of its lipidation, and the putative function of the
lipid moiety in anchoring the protein to the outer
envelope. The C-terminal end is immunodominant and
includes domains that account in part for structural
heterogeneity (Wilske, B. et al., Med. Microbiol. Immunol.
181:191-207 (1992)), and may provide epitopes for antibody
neutralization (Sears, J.E. et al., J. Immunol. 147(6):
1995-2000 (1991)), and relate to other activities, such as
the induction of T-cell proliferation (Shanafel, M.M. et
al., J. Immunol. 148:218-224 (1992)). There are common
epitopes in the carboxy-end of the protein that are shared
among genospecies which may have immunoprotective potential
(Wilske, B. et al., Med. Microbiol. Immunol. 181:191-207
(1992)).

Prediction of secondary structure on the basis of
hydropathy analysis and circular dichroism and fluorescence
spectroscopy measurements (McGrath, B.C. et al., Vaccines,
Cold Spring Harbor Laboratory Press, Plainview, New York,
pp. 365-370 (1993)) suggests domains 3 and 4 to be in a
region of the molecule with a propensity to form
alpha-helix, whereas domains 1 and 2 occur in regions predicted
to be beta-sheets (see Figure 1). These differences may
distinguish domains in accessibility to antibody or to
reactive T-cells (Shanafel, M.M. et al., J. Immunol. 148:
218-224 (1992)). Site-directed mutagenesis of specific
epitopes, as described below in Example 2, aids in
identifying exact epitopes.



Example 2. Identification of an Immunologically
Important Hypervariable Domain of the Major
Outer Surface Protein A of Borrelia

This Example describes epitope mapping studies using
chemically cleaved OspA and TrpE-OspA fusion proteins. The
studies indicate a hypervariable region surrounding the
single conserved tryptophan residue of OspA (at residue
216, or in some cases 217), as determined by a moving
window population analysis of OspA from fifteen European
and North American isolates of Borrelia. The hypervariable
region is important for immune recognition.
Site-directed mutagenesis was also conducted to
examine the hypervariable regions more closely.
Fluorescence and circular dichroism spectroscopy have
indicated that the conserved tryptophan is part of an
alpha-helical region in which the tryptophan is buried in a
hydrophobic environment (McGrath, B.C. et al., Vaccines,
Cold Spring Harbor Laboratory Press, Plainview, New York,
pp. 365-370 (1993)). More polar amino acid side-chains
flanking the tryptophan are likely to be exposed to the
hydrophilic solvent. The hypervariability of these
solvent-exposed residues among the various strains of
Borrelia suggested that these amino acid residues may
contribute to the antigenic variation in OspA. Therefore,
site-directed mutagenesis was performed to replace some of
the potentially exposed amino acid side chains in the
protein from one strain with the analogous residues of a
second strain. The altered proteins were then analyzed by
Western Blot using monoclonal antibodies which bind OspA on
the surface of the intact, non-mutated spirochete. The
results indicated that certain specific amino acid changes
near the tryptophan can abolish reactivity of OspA to these
monoclonal antibodies.

A. Verification of Clustered Polymorphisms in Outer
Surface Protein A Sequences

Cloning and sequencing of the OspA protein from
fifteen European and North American isolates (described
above in Table I) demonstrated that amino acid polymorphism
is not randomly distributed throughout the protein; rather,
polymorphism tended to be clustered in three regions of
OspA. The analysis was carried out by plotting the moving,
weighted average polymorphism of a window (a fixed length
subsection of the total sequence) as it is slid along the
sequence. The window size in this analysis was thirteen
amino acids, based upon the determination of the largest
number of significantly deviating points as established by
the method of Tajima (J. Mol. Evol. 33:470-473 (1991)).
The average weighted polymorphism was calculated by summing
the number of variant alleles for each site. Polymorphism
calculations were weighted by the severity of amino acid
replacement (Dayhoff, M.O. et al., in: Dayhoff, M.O. (ed.)
Atlas of Protein Sequence and Structure, NBRF, Washington,
Vol. 5, Suppl. 3:345 (1978)). The sum was normalized by
the window size and plotted. The plotted amino acid sequence
position identifies the window; for example, position 1
corresponds to a window that encompasses amino
acids 1 through 13. Bootstrap resampling was used to
generate 95% confidence intervals on the sliding window
analysis. Since Borrelia has been shown to be clonal, the
bootstrap analysis should give a reliable estimate of the
expected variance of the polymorphism calculations. The
bootstrap was iterated five hundred times at each position,
and the mean was calculated from the sum of all positions.
The clonal nature of Borrelia ensures that the stochastic
variance that results from differing genealogical histories
of the sequence positions (as would be expected if
recombination were prevalent) will be minimized.
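
The sliding-window calculation described above can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used: the severity weighting is reduced to a placeholder `weight` function that returns 1 for every replacement (the Dayhoff-based weights are not reproduced here), the bootstrap is assumed to resample whole sequences with replacement, and the alignment `toy` is invented. All function names are hypothetical.

```python
import random

def site_polymorphism(column, weight=lambda major, variant: 1.0):
    """Weighted count of variant alleles at one alignment column."""
    alleles = sorted(set(column))
    major = max(alleles, key=column.count)  # most frequent residue
    return sum(weight(major, a) for a in alleles if a != major)

def window_profile(alignment, window=13):
    """Moving average; entry i covers columns i .. i + window - 1."""
    per_site = [site_polymorphism(list(col)) for col in zip(*alignment)]
    return [sum(per_site[i:i + window]) / window
            for i in range(len(per_site) - window + 1)]

def bootstrap_bands(alignment, window=13, n_iter=500, seed=0):
    """Percentile 95% band per window, resampling sequences with replacement."""
    rng = random.Random(seed)
    runs = []
    for _ in range(n_iter):
        sample = [rng.choice(alignment) for _ in alignment]
        runs.append(window_profile(sample, window))
    lo_idx = int(0.025 * n_iter)
    hi_idx = int(0.975 * n_iter) - 1
    bands = []
    for values in zip(*runs):
        ordered = sorted(values)
        bands.append((ordered[lo_idx], ordered[hi_idx]))
    return bands

# Invented four-sequence alignment; a window of 3 keeps the example small.
toy = ["AAAAAA", "AAGAAA", "AATAAA", "AAGCAA"]
profile = window_profile(toy, window=3)
bands = bootstrap_bands(toy, window=3)
print(profile)
print(bands)
```

With real data the window would be thirteen residues and windows whose observed value exceeds the upper bootstrap bound would be candidates for the clustered regions.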

This test verified that the three regions around the
observed peaks all have significant excesses of
polymorphism. Excesses of polymorphism were observed in
the regions including amino acid residues 132-145, residues
163-177, and residues 208-221 (Figure 3). An amino acid
alignment between residues 200 and 220 for B31, K48 and the
four site-directed mutants is shown in Figure 4. The amino
acid 208-221 region includes the region of OspA which has
been modeled as an oriented alpha-helix in which the single
tryptophan residue at amino acid 216 is buried in a
hydrophobic pocket, thereby exposing more polar amino acids
to the solvent (Figure 5) (France, L.L. et al., Biochim.
Biophys. Acta 1120:59 (1992)). These potentially
solvent-exposed residues showed considerable variability among the
OspAs from various strains and may be an important
component of OspA antigenic variation. For the purposes of
generating chimeric proteins, the hypervariable domains of
interest are Domain A, which includes amino acid residues
120-140 of OspA; Domain B, which includes residues 150-180;
and Domain C, which includes residues 200-216 or 217.

B. Site-Directed Mutagenesis of the Hypervariable Region

Site-directed mutagenesis was performed to convert
residues within the 204-219 domain of the recombinant B31
OspA to the analogous residues of a European OspA variant,
K48. In the region of OspA between residues 204 and 219,
which includes the helical domain (amino acids 204-217),
there are seven amino acid differences between OspA-B31 and
OspA-K48. Three oligonucleotides were generated, each
containing nucleotide changes which would incorporate K48
amino acids at their analogous positions in the B31 OspA
protein. The oligos used to create the site-directed
mutants were:

5'-CTTAATGACTCTGACACTAGTGC-3' (#613, which converts
threonine at position 204 to serine, and serine at 206 to
threonine (Thr204-Ser, Ser206-Thr)) (SEQ ID NO. 1);

5'-GCTACTAAAAAAACCGGGAAATGGAATTCA-3' (#625, which converts
alanine at 214 to glycine, and alanine at 215 to lysine
(Ala214-Gly, Ala215-Lys)) (SEQ ID NO. 2); and

5'-GCAGCTTGGGATTCAAAAACATCCACTTTAACA-3' (#640, which
converts asparagine at 217 to aspartate, and glycine at
219 to lysine (Asn217-Asp, Gly219-Lys)) (SEQ ID NO. 3).
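
As a check on the oligonucleotides listed above, the following Python sketch translates each one with the standard genetic code and, where a tryptophan codon is present, numbers the encoded residues by anchoring that tryptophan at position 216. The reading frame is assumed to begin at the first base of each oligo, which is an assumption made for this illustration; the helper names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

OLIGOS = {
    "613": "CTTAATGACTCTGACACTAGTGC",
    "625": "GCTACTAAAAAAACCGGGAAATGGAATTCA",
    "640": "GCAGCTTGGGATTCAAAAACATCCACTTTAACA",
}

def translate(dna):
    """Translate in frame 0, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

def number_from_trp(peptide, trp_position=216):
    """Map residue number to residue, anchoring the first Trp at trp_position."""
    offset = trp_position - peptide.index("W")
    return {offset + i: aa for i, aa in enumerate(peptide)}

for name, dna in OLIGOS.items():
    peptide = translate(dna)
    print(name, peptide)
    if "W" in peptide:
        print("   ", number_from_trp(peptide))
```

Oligo #625 encodes Gly and Lys at positions 214 and 215, and #640 encodes Asp and Lys at 217 and 219, matching the stated changes. Oligo #613 contains no tryptophan codon and so cannot be anchored this way; its encoded peptide places Ser and Thr two residues apart, consistent with changes at positions 204 and 206.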

Site-directed mutagenesis was carried out by
performing mutagenesis with pairs of the above oligos.
Three site-directed mutants were created, each with two
changes: OspA 613 (Thr204-Ser, Ser206-Thr), OspA 625
(Ala214-Gly, Ala215-Lys), and OspA 640 (Asn217-Asp, Gly219-Lys).
There were also two proteins with four changes: OspA
613/625 (Thr204-Ser, Ser206-Thr, Ala214-Gly, Ala215-Lys)
and OspA 613/640 (Thr204-Ser, Ser206-Thr, Asn217-Asp,
Gly219-Lys).



Specificity of Antibody Binding to Epitopes of the 
Non-mutated Hypervariable Region 
Monoclonal antibodies that agglutinate spirochetes,
including several which are neutralizing in vitro,
recognize epitopes that map to the hypervariable region
around Trp216 (Barbour, A.G. et al., Infect. Immun. 41:
795 (1983); Schubach, W.H. et al., Infect. Immun. 59:
1911 (1991)). Western Blot analysis demonstrated that
chemical cleavage of OspA from the B31 strain at Trp216
abolishes reactivity of the protein with the agglutinating
Mab 105, a monoclonal raised against B31 spirochetes (data
not shown). The reagent, N-chlorosuccinimide (NCS),
cleaves OspA at Trp216, forming a 23.2 kd fragment and
a 6.2 kd peptide which is not retained on the Immobilon-P
membrane after transfer. The uncleaved material binds Mab
105; however, the 23.2 kd fragment is unreactive. Similar
Western blots with a TrpE-OspA fusion protein containing
the carboxy-terminal portion of the OspA protein
demonstrated that the small 6.2 kd piece also fails to bind
Mab 105 (Schubach, W.H. et al., Infect. Immun. 59:1911
(1991)).

Monoclonal antibodies H5332 and H3TS (Barbour, A.G. et
al., Infect. Immun. 41:795 (1983)) have been shown by
immunofluorescence to decorate the surface of fixed
spirochetes (Wilske, B. et al., World J. Microbiol. 7:130
(1991)). These monoclonals also inhibit the growth of the
organism in culture. Epitope mapping with fusion proteins
has confirmed that the epitopes which bind these Mabs are
conformationally determined and reside in the carboxy half
of the protein. Mab H5332 is cross-reactive among all of
the known phylogenetic groups, whereas Mab H3TS and Mab 105
seem to be specific to the B31 strain to which they were
raised. Like Mab 105, the reactivities of H5332 and H3TS
to OspA are abrogated by fragmentation of the protein at
Trp216 (data not shown). Mab 336 was raised to whole
spirochetes of the strain PGau. It cross-reacts to OspA
from group 1 (the group to which B31 belongs) but not to
group 2 (of which K48 is a member). Previous studies using
fusion proteins and chemical cleavage have indicated that
this antibody recognizes a domain of OspA in the region
between residues 217 and 273 (data not shown). All of
these Mabs will agglutinate the B31 spirochete.






Western Blot Analysis of Antibody Binding to Mutated 
Hypervariable Regions 

Mabs were used for Western Blot analysis of the
site-directed OspA mutants induced in E. coli using the T7
expression system (Dunn, J.J. et al., Protein Expression
and Purification 1:159 (1990)). E. coli cells carrying
pET9c plasmids having a site-directed OspA mutant insert
were induced at mid-log phase growth with IPTG for four
hours at 37°C. Cell lysates were made by boiling an
aliquot of the induced cultures in SDS gel loading dye,
and this material was then loaded onto a 12% SDS gel
(BioRad mini-Protean II), and electrophoresed. The
proteins were then transferred to Immobilon-P membranes
(Millipore) at 70 V for 2 hours at 4°C using the BioRad mini
transfer system. Western analysis was carried out as
described by Schubach et al. (Infect. Immun. 59:1911
(1991)).

Western Blot analysis indicated that only the 625
mutant (Ala214-Gly and Ala215-Lys) retained binding to the
agglutinating monoclonal H3TS (data not shown). However,
the 613/625 mutant, which has additional alterations on the
amino-terminal side of Trp216 (Thr204-Ser and Ser206-Thr), did
not bind this monoclonal. Both 640 and 613/640 OspAs, which
have the Asn217-Asp and Gly219-Lys changes on the
carboxy-terminal side of Trp216, also failed to bind Mab H3TS. This
indicated that the epitope of the B31 OspA which binds H3TS
is comprised of amino acid side-chains on both sides of
Trp216.

The 613/625 mutant failed to bind Mabs 105 and H5332,
while the other mutants retained their ability to bind
these Mabs. This is important in light of the data using
fusion proteins that indicate that Mab 105 behaves more
like Mab H3TS in terms of its serotype specificity and
binding to OspA (Wilske, B. et al., Med. Microbiol.
Immunol. 181:191 (1992)). The 613/625 protein has, in
addition to the differences at residues Thr204 and Ser206,
changes immediately amino-terminal to Trp216 (Ala214-Gly
and Ala215-Lys). The abrogation of reactivity of Mabs 105
and H5332 to this protein indicated that the epitopes of
OspA which bind these monoclonals are comprised of residues
on the amino-terminal side of Trp216.

The two proteins carrying the Asn217-Asp and Gly219-Lys
replacements on the carboxy-terminal side of Trp216
(OspAs 640 and 613/640) retained binding to Mabs 105 and
H5332; however, they failed to react with Mab 336, a
monoclonal which has been mapped with TrpE-OspA fusion
proteins and by chemical cleavage to a more
carboxy-terminal domain. This result may explain why Mab 336
failed to recognize the K48-type of OspA (Group 2).

It is clear that amino acids Thr204 and Ser206 play an
important part in the agglutinating epitopes in the region
of the B31 OspA flanking Trp216. Replacement of these two
residues altered the epitopes of OspA that bind Mabs 105,
H3TS and H5332. The ability of the 640 changes alone to
abolish reactivity of Mab 336 indicated that Thr204 and
Ser206 are not involved in direct interaction with Mab 336.

The results indicated that the epitopes of OspA which
are available to Mabs that agglutinate spirochetes are
comprised at least in part by amino acids in the immediate
vicinity of Trp216. Since recent circular dichroism
analysis indicated that the structures of B31 and K48 OspA
differ very little within this domain, it is unlikely that
the changes made by mutation have radically altered the
overall structure of the OspA protein (France, L.L. et al.,
Biochim. Biophys. Acta 1120:59 (1992); and France et al.,
Biochim. Biophys. Acta, submitted (1993)). This hypothesis
is supported by the finding that the recombinant, mutant
OspAs exhibit the same high solubility and purification
properties as the parent B31 protein (data not shown).

In summary, amino acid side-chains at Thr204 and
Ser206 are important for many of the agglutinating
epitopes. However, a limited set of conservative changes
at these sites was not sufficient to abolish binding of
all of the agglutinating Mabs. These results suggested
that the agglutinating epitopes of OspA are distinct, yet
may have some overlap. The results also supported the
hypothesis that the surface-exposed epitope around Trp216,
which is thought to be important for immune recognition and
neutralization, is a conformationally determined and complex
domain of OspA.






Example 3. Borrelia Strains and Proteins

Proteins and genes from any strain of Borrelia can be
utilized in the current invention. Representative strains
are summarized in Table I, above.

A. Genes Encoding Borrelia Proteins

The chimeric peptides of the current invention can
comprise peptides derived from any Borrelia proteins.
Representative proteins include OspA, OspB, OspC, OspD,
p12, p39, p41 (fla), p66, and p93. Nucleic acid sequences
encoding several Borrelia proteins are presently available
(see Table II, below); alternatively, nucleic acid
sequences encoding Borrelia proteins can be isolated and
characterized using methods such as those described below.



Table II. References for Nucleic Acid Sequences for Several
Proteins of Various Borrelia Strains

Strain  p93                           OspA                            p41 (fla)
------  ----------------------------  ------------------------------  ------------------------
K48     X69602 (SID 67)               X62624 (SID 8)                  X69610 (SID 49)
PGau    SID 73                        X62387 (SID 10)                 X69612 (SID 51)
DK29                                  X63412 (SID 137)                X69608 (SID 53)
PKo     X69803 (SID 77)               X65599 (SID 141)                [illegible] (SID 131)
PTrob   X69604 (SID 71)               X65598 (SID 135)                X69614 (SID 55)
Ip3                                   X70365 (SID 140)
Ip90    ND                            Kryuchechnikov, V.N. et al.,
                                      J. Microbiol. Epid.
                                      Immunobiol. 12:41-44 (1988)
                                      (SID 138)
25015   X70365 (SID 75)               Fikrig, E.S. et al.,
                                      J. Immunol. 7:2256-2260
                                      (1992) (SID 12)
B31     Perng, G.C. et al., Infect.   Bergstrom, S. et al.,           Gassmann, G.S. et al.,
        Immun. 59:2070-74 (1992);     Mol. Microbiol. 3:479-486       Nucl. Acids Res. 17:
        Luft, B.J. et al., Infect.    (1989) (SID 6)                  3590 (1989) (SID 127)
        Immun. 60:4309-4321 (1992)
        (SID 65)
PKal                                  X69606 (SID 132)                X69611 (SID 129)
ZS7                                   Jonsson, M. et al.,
                                      Infect. Immun. 60:1845-1853
                                      (1992) (SID 134)
N40                                   Kryuchechnikov, V.N. et al.
                                      (SID 133)
PHei                                  X65600 (SID 136)
ACAI                                  Kryuchechnikov, V.N. et al.
                                      (SID 142)
PBo     X69601 (SID 69)               X65605 (SID 139)                X69610 (SID 130)

Numbers with an "X" prefix are GenBank data base accession numbers.
SID = SEQ ID NO.

B. Isolation of Borrelia Genes

Nucleic acid sequences encoding full length, lipidated
proteins from known Borrelia strains were isolated using
the polymerase chain reaction (PCR) as described below. In
addition, nucleic acid sequences were generated which
encoded truncated proteins (proteins in which the
lipidation signal has been removed, such as by eliminating
the nucleic acid sequence encoding the first 18 amino
acids, resulting in non-lipidated proteins). Other
nucleic acid sequences were generated which encoded
polypeptides of a particular gene (i.e., encoding a segment
of the protein which has a different number of amino acids
than the protein does in nature). Using methods similar to those
described below, primers can be generated from known
nucleic acid sequences encoding Borrelia proteins and used
to isolate other genes encoding Borrelia proteins. Primers
can be designed to amplify all of a gene, as well as to
amplify a nucleic acid sequence encoding truncated protein
sequences, such as described below for OspC, or nucleic
acid sequences encoding a polypeptide derived from a
Borrelia protein. Primers can also be designed to
incorporate unique restriction enzyme cleavage sites into
the amplified nucleic acid sequences. Sequence analysis of
the amplified nucleic acid sequences can then be performed
using standard techniques.

Cloning and Sequencing of OspA Genes and Relevant
Nucleic Acid Sequences

Borrelia OspA sequences were isolated in the following
manner: 100 µl reaction mixtures containing 50 mM KCl, 10
mM TRIS-HCl (pH 8.3), 1.5 mM MgCl2, 200 µM each NTP, 2.5
units of Taq DNA polymerase (Amplitaq, Perkin-Elmer/Cetus)
and 100 pmol each of the 5' and 3' primers (described
below) were used. Amplification was performed in a
Perkin-Elmer/Cetus thermal cycler as described (Schubach, W.H. et
al., Infect. Immun. 59:1911-1915 (1991)). The amplicon was
visualized on an agarose gel by ethidium bromide staining.
Twenty nanograms of the chloroform-extracted PCR product
were cloned directly into the POTA vector (Invitrogen) by
following the manufacturer's instructions. Recombinant
colonies containing the amplified fragment were selected,
the plasmids were prepared, and the nucleic acid sequence
of each OspA was determined by the dideoxy chain-termination
technique using the Sequenase kit (United
States Biochemical). Directed sequencing was performed
with M13 primers followed by OspA-specific primers derived
from sequences previously obtained with M13 primers.

Because the 5' and 3' ends of the OspA gene are highly
conserved (Fikrig, E.S. et al., J. Immunol. 7:2256-2260
(1992); Bergstrom, S. et al., Mol. Microbiol. 3:479-486
(1989); Zumstein, G. et al., Med. Microbiol. Immunol. 181:
57-70 (1992)), the 5' and 3' primers for cloning can be
based upon any known OspA sequences. For example, the
following primers based upon the OspA nucleic acid sequence
from strain B31 were used:

5'-GGAGAATATATTATGAAA-3' (-12 to +6) (SEQ ID NO. 4); and
5'-CTCCTTATTTTAAAGCG-3' (+826 to +809) (SEQ ID NO. 5).
(Schubach, W.H. et al., Infect. Immun. 59:1811-1915 (1991)).
OspA genes isolated in this manner include those for
strains B31, K48, PGau, and 25015; the nucleic acid
sequences are depicted in the sequence listing as SEQ ID
NO. 6 (OspA-B31), SEQ ID NO. 8 (OspA-K48), SEQ ID NO. 10
(OspA-PGau), and SEQ ID NO. 12 (OspA-25015). An alignment
of these and other OspA nucleic acid sequences is shown in
Figure 42. The amino acid sequences of the proteins
encoded by these nucleic acid sequences are represented as
SEQ ID NO. 7 (OspA-B31), SEQ ID NO. 9 (OspA-K48), SEQ ID
NO. 11 (OspA-PGau), and SEQ ID NO. 13 (OspA-25015).
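
The coordinate annotation given for the B31-derived plus strand
primer (SEQ ID NO. 4, -12 to +6) can be checked against the primer
itself. The following is a minimal sketch, not part of the
original protocol. It assumes the common convention that +1 is
the A of the ATG start codon and that there is no position 0; the
function names are illustrative only.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def span_length(start: int, end: int) -> int:
    """Count positions from start to end inclusive.

    ASSUMPTION: gene coordinates skip 0, so -1 is followed directly by +1.
    """
    lo, hi = sorted((start, end))
    length = hi - lo + 1
    if lo < 0 < hi:
        length -= 1  # no position 0 between -1 and +1
    return length


# Plus strand primer from the text (SEQ ID NO. 4), annotated -12 to +6.
FORWARD = "GGAGAATATATTATGAAA"

upstream = FORWARD.find("ATG")       # bases lying 5' of the start codon
annotated = span_length(-12, 6)      # length implied by the annotation

print(f"primer length: {len(FORWARD)}, annotated span: {annotated}")
print(f"bases upstream of ATG: {upstream}")
print(f"minus strand equivalent: {reverse_complement(FORWARD)}")
```

Under that convention the 18-base primer matches its annotation:
12 bases upstream of the ATG, followed by the first six coding
bases.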

The following primers were used to generate specific
nucleic acid sequences of the OspA gene, to be used to
generate chimeric nucleic acid sequences (as described in
Example 4):

5'-GTCTGCAAAAACCATGACAAG-3' (plus strand primer #369) (SEQ
ID NO. 14);

5'-GTCATCAACAGAAGAAAAATTC-3' (plus strand primer #357)
(SEQ ID NO. 15);

5'-CCGGATCCATATGAAAAAATATTTATTGGG-3' (plus strand primer
#607) (SEQ ID NO. 16);

5'-cCGGGATCCATATGGCTAAGCAAAATGTTAGC-3' (plus strand primer
#584) (SEQ ID NO. 17);

5'-GCGTTCAAGTACTCCAGA-3' (minus strand primer #200) (SEQ
ID NO. 18);

5'-GATATCTAGATCTTATTTTAAAGCGTT-3' (minus strand primer
#586) (SEQ ID NO. 19); and

5'-GGATCCGGTGACCTTTTAAAGCGTTTTTAAT-3' (minus strand primer
#1169) (SEQ ID NO. 20).

Cloning and Sequencing of OspB 

Similar methods were also used to isolate OspB genes. 
One OspB gene isolated is represented as SEQ ID NO. 21
(OspB-B31); its encoded amino acid sequence is SEQ ID NO.
22.

The following primers were used to generate specific
nucleic acid sequences of the OspB gene, to be used in
generation of chimeric nucleic acid sequences (see Example
4):

5'-GGTACAATTACAGTACAA-3' (plus strand primer #721) (SEQ ID
NO. 23);

5'-CCGAGAATCTCATATGGCACAAAAAGGTGCTGAGTCAATTGG-3' (plus
strand primer #1105) (SEQ ID NO. 24);

5'-cCGATATCGGATCCTATTTTAAAGCGTTTTTAAGC-3' (minus strand
primer #1106) (SEQ ID NO. 25); and

5'-GGATCCGGTGACCTTTTAAAGCGTTTTTAAG-3' (minus strand primer
#1170) (SEQ ID NO. 26).

Cloning and Sequencing of OspC

Similar methods were also used to isolate OspC genes.
The following primers were used to isolate entire OspC
genes from Borrelia strains B31, K48, PKo, and pTrob:

5'-GTGCGCGACCATATGAAAAAGAATACATTAAGTGCG-3' (plus strand
primer having NdeI site combined with start codon) (SEQ ID
NO. 27), and

5'-GTCGGCGGATCCTTAAGGTTTTTTTGGACTTTCTGC-3' (minus strand
primer having BamHI site followed by stop codon) (SEQ ID
NO. 28).
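
The restriction sites built into the two OspC primers above can
be located with a simple string search. This is an illustrative
sketch only. The recognition sequences used (NdeI = CATATG, BamHI
= GGATCC) are the standard ones for these enzymes; the helper
names are hypothetical and are not taken from the specification.

```python
# Standard recognition sequences for the two enzymes named in the text.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    comp = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def find_sites(primer: str, sites: dict = SITES) -> dict:
    """Map each enzyme name to the 0-based index of its first site, if any."""
    primer = primer.upper()
    return {name: primer.find(site) for name, site in sites.items()
            if site in primer}


PLUS = "GTGCGCGACCATATGAAAAAGAATACATTAAGTGCG"   # SEQ ID NO. 27
MINUS = "GTCGGCGGATCCTTAAGGTTTTTTTGGACTTTCTGC"  # SEQ ID NO. 28

print("plus strand primer sites:", find_sites(PLUS))
print("minus strand primer sites:", find_sites(MINUS))

# The NdeI site CATATG overlaps the ATG start codon of the plus strand primer.
start_codon = PLUS[12:15]
# The three bases after the BamHI site read as a stop codon on the plus strand.
stop_codon = reverse_complement(MINUS[12:15])
print("start codon:", start_codon, "| stop codon (plus strand):", stop_codon)
```

The search shows the NdeI site overlapping the start codon in the
plus strand primer, and the BamHI site immediately 5' of the
stop codon complement in the minus strand primer, as the
annotations in the text describe.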

The nucleic acid sequences of the OspC genes were then
determined by the dideoxy chain-termination technique using
the Sequenase kit (United States Biochemical). OspC
genes isolated and sequenced in this manner include those
for strains B31, K48, PKo, and Tro; the nucleic acid
sequences are depicted in the sequence listing as SEQ ID
NO. 29 (OspC-B31), SEQ ID NO. 31 (OspC-K48), SEQ ID NO. 33
(OspC-PKo), and SEQ ID NO. 35 (OspC-Tro). An alignment of
these sequences is shown in Figure 38. The amino acid
sequences of the proteins encoded by these nucleic acid
sequences are represented as SEQ ID NO. 30 (OspC-B31), SEQ
ID NO. 32 (OspC-K48), SEQ ID NO. 34 (OspC-PKo), and SEQ ID
NO. 36 (OspC-Tro).

Truncated OspC genes were generated using other
primers. These primers were designed to amplify nucleic
acid sequences, derived from the OspC gene, that lacked the
nucleic acids encoding the signal peptidase sequence of the
full-length protein. The primers corresponded to bp 58-75
of the natural gene, with codons for Met-Ala attached
ahead. For strain B31, the following primer was used:
5'-GTGCGCGACCATATGGCTAATAATTCAGGGAAAGAT-3' (SEQ ID NO.
37).

For strain PKo,
5'-GTGCGCGACCATATGGCTAGTAATTCAGGGAAAGGT-3' (SEQ ID NO. 38)
was used.

For strains pTrob and K48,
5'-GTGCGCGACCATATGGCTAATAATTCAGGTGGGGAT-3' (SEQ ID NO. 39)
was used.

Additional primers were also designed to amplify
nucleic acids encoding particular polypeptides, for use in
creation of chimeric nucleic acid sequences (see Example
4). These primers included:

5'-CTTGGAAAATTATTTGAA-3' (plus strand primer #520) (SEQ ID
NO. 40);

5'-CACGGTCACCCCATGGGAAATAATTCAGGGAAAGG-3' (plus strand
primer #58) (SEQ ID NO. 41);

5'-TATAGATGACAGCAACGC-3' (minus strand primer #207) (SEQ
ID NO. 42); and

5'-CCGGTGACCCCATGGTACCAGGTTTTTTTGGACTTTCTGC-3' (minus
strand primer #636) (SEQ ID NO. 43).

Cloning and Sequencing of OspD 

Similar methods can be used to isolate OspD genes. An 
alignment of four OspD nucleic acid sequences (from strains 
PBo, PGau, DK29, and K48) is shown in Figure 39.

Cloning and Sequencing of p12

The p12 gene was similarly identified. Primers used
to clone the entire p12 gene included:
5'-CCGGATCCATATGGTTAAAAAAATAATATTTATTTC-3' (forward primer
#757) (SEQ ID NO. 44); and
5'-GATATCTAGATCTTTAATTGCTCTGCTCACTCTCTTC-3' (reverse primer
#758) (SEQ ID NO. 45).

To amplify a truncated p12 gene (one in which the
encoded protein is non-lipidated, and begins at amino
acid 18 of the native sequence), the following primers were
used: 5'-CCGGGATCCATATGGCTAGTGCAATTGGTCGTGG-3' (forward
primer #759) (SEQ ID NO. 46); and primer #758 (SEQ ID NO.
45).

Cloning and Sequencing of p41 (fla)
A similar approach was used to clone and sequence
genes encoding the p41 (fla) protein. The p41 sequences
listed in Table II with GenBank accession numbers were
isolated using the following primers from strain B31:

5'-ATGATTATCAATCATAAT-3' (+1 to +18) (SEQ ID NO. 47); and
5'-TCTGAACAATGACAAAAC-3' (+1008 to +991) (SEQ ID NO. 48).

The nucleic acid sequences of p41 isolated in this manner
are depicted in the sequence listing as SEQ ID NO. 51 (p41-
PGau), and SEQ ID NO. 53 (p41-DK29). An alignment of
several p41 nucleic acid sequences, including those for
strains B31, pKal, PGau, pBo, DK29, and pKo, is shown in
Figure 41. The amino acid sequences of the proteins
encoded by these nucleic acid sequences are represented as
SEQ ID NO. 50 (p41-K48), SEQ ID NO. 52 (p41-PGau), SEQ ID
NO. 54 (p41-DK29), SEQ ID NO. 56 (p41-PTrob), and SEQ ID
NO. 58 (p41-PHei).

Other primers were designed to amplify nucleic acid
sequences encoding polypeptides of p41, to be used in
chimeric nucleic acid sequences. These primers included:

5'-TTGGATCCGGTCACCCCATGGCTCAATATAACCAATG-3' (minus strand
primer #122) (SEQ ID NO. 59);

5'-TTGGATCCGGTCACCCCATGGCTTCTCAAAATGTAAG-3' (plus strand
primer #140) (SEQ ID NO. 60);

5'-TTGGATCCGGTGACCAACTCCGCCTTGAGAAGG-3' (minus strand
primer #234) (SEQ ID NO. 61); and

5'-TTGGATCCGGTGACCTATTTGAGCATAAGATGC-3' (minus strand
primer #141) (SEQ ID NO. 62).

Cloning and Sequencing of p93 
The same approach was also used to clone and sequence
genes encoding the p93 protein. Genes encoding p93, as
listed in Table II with GenBank accession numbers, were
isolated by this method with the following primers from
strain B31:

5'-GGTGAATTTAGTTGGTAAGG-3' (-54 to -35) (SEQ ID NO. 63);
and

5'-CACCAGTTTCTTTAAGCTGCTCCTGC-3' (+1117 to +1092) (SEQ ID
NO. 64).

The nucleic acid sequences of p93 isolated in this
manner are depicted in the sequence listing as SEQ ID NO.
65 (p93-B31), SEQ ID NO. 67 (p93-K48), SEQ ID NO. 69 (p93-
PBo), SEQ ID NO. 71 (p93-PTrob), SEQ ID NO. 73 (p93-PGau),
SEQ ID NO. 75 (p93-25015), and SEQ ID NO. 77 (p93-PKo).
The amino acid sequences of the proteins encoded by these
nucleic acid sequences are represented as SEQ ID NO. 66
(p93-B31), SEQ ID NO. 68 (p93-K48), SEQ ID NO. 70 (p93-PBo),
SEQ ID NO. 72 (p93-PTrob), SEQ ID NO. 74 (p93-PGau), SEQ ID
NO. 76 (p93-25015), and SEQ ID NO. 78 (p93-PKo).

Other primers were used to amplify nucleic acid
sequences encoding polypeptides of p93 to be used in
generating chimeric nucleic acid sequences. These primers
included:

5'-CCGGTCACCCCATGGCTGCTTTAAAGTCTTTA-3' (plus strand primer
#475) (SEQ ID NO. 79);

5'-CCGGTCACCCCATGAATCTTGATAAAGCTCAG-3' (plus strand primer
#900) (SEQ ID NO. 80);

5'-CCGGTCACCCCATGGATGAAAAGCTTTTAAAAAGT-3' (plus strand
primer #1168) (SEQ ID NO. 81);

5'-CCGGTCACCCCCATGGTTGAGAAATTAGATAAG-3' (plus strand
primer #1423) (SEQ ID NO. 82); and

5'-TTGGATCCGGTGACCCTTAACTTTTTTTAAAG-3' (minus strand
primer #2100) (SEQ ID NO. 83).

C. Expression of Proteins from Borrelia Genes 
The nucleic acid sequences described above can be
incorporated into expression plasmids, using standard
techniques, and transfected into compatible host cells in
order to express the proteins encoded by the nucleic acid
sequences. As an example, the expression of the p12 gene and
the isolation of p12 protein is set forth.

Amplification of the p12 nucleic acid sequence was
conducted with primers that incorporated an NdeI restriction
site into the nucleic acid sequence. The PCR product was
extracted with phenol/chloroform and precipitated with
ethanol. The precipitated product was digested and ligated
into an expression plasmid as follows: 15 µl
(approximately 1 ng) of PCR DNA was combined with 2 µl 10X
restriction buffer for NdeI (Gibco/BRL), 1 µl NdeI
(Gibco/BRL), and 2 µl distilled water, and incubated
overnight at 37°C. This mixture was subsequently combined
with 3 µl 10X buffer (buffer 3, New England BioLabs), 1 µl
BamHI (NEB), and 6 µl distilled water, and incubated at 37°C
for two hours. The resultant material was purified by
preparative gel electrophoresis using low melting point
agarose, and the band was visualized under long wave
ultraviolet light and excised from the gel. The gel slice
was treated with Gelase using conditions recommended by the
manufacturer (Epicentre Technologies). The resulting DNA
pellet was resuspended in 25-50 µl of 10 mM TRIS-Cl (pH
8.0) and 1 mM EDTA (TE). An aliquot of this material was
ligated into the pET9c expression vector (Dunn, J.J. et
al., Protein Expression and Purification 1: 159 (1990)).
To ligate the material into the pET9c expression
vector, 20-50 ng of p12 nucleic acid sequences cut and
purified as described above was combined with 5 µl 10X One-
Phor-All (OPA) buffer (Pharmacia), 30-60 ng pET9c cut with
NdeI and BamHI, 2.5 µl 20 mM ATP, 2 µl T4 DNA ligase
(Pharmacia) diluted 1:5 in 1X OPA buffer, and sufficient
distilled water to bring the final volume to 50 µl. The
mixture was incubated at 12°C overnight.

The resultant ligations were transformed into
competent DH5-alpha cells and plated on nutrient agar
plates containing 50 µg/ml kanamycin and incubated
overnight at 37°C. DH5-alpha is used as a "storage
strain" for T7 expression clones, because it is RecA
deficient, so that recombination and concatenation are not
problematic, and because it lacks the T7 RNA polymerase
gene necessary to express the cloned gene. The use of this
strain allows for cloning of potentially toxic gene
products while minimizing the chance of deletion and/or
rearrangement of the desired genes. Other cell lines
having similar properties may also be used.

Kanamycin resistant colonies were single-colony
purified on nutrient agar plates supplemented with
kanamycin at 50 µg/ml. A colony from each isolate was
inoculated into 3-5 ml of liquid medium containing 50 µg/ml
kanamycin, and incubated at 37°C without agitation.
Plasmid DNA was obtained from 1 ml of each isolate using a
hot alkaline lysis procedure (Maniatis, T. et al.,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, NY (1982)).

Plasmid DNA was digested with EcoRI and BglII in the
following manner: 15 µl plasmid DNA was combined with 2 µl
10X buffer 3 (NEB), 1 µl EcoRI (NEB), 1 µl BglII (NEB) and 1
µl distilled water, and incubated for two hours at 37°C.
The entire reaction mixture was electrophoresed on an
analytical agarose gel. Plasmids carrying the p12 insert
were identified by the presence of a band corresponding to
925 base-pairs (full length p12) or 875 base-pairs
(nonlipidated p12).

One or two plasmid DNAs from the full length and
nonlipidated p12 clones in pET9c were used to transform
BL21 DE3 pLysS to kanamycin resistance as described by
Studier et al. (Methods in Enzymology, Goeddel, D. (Ed.),
Academic Press, 185: 60-89 (1990)). One or two
transformants of the full length and nonlipidated clones
were single-colony purified on nutrient plates containing
25 µg/ml chloramphenicol (to maintain pLysS) and 50 µg/ml
kanamycin at 37°C. One colony of each isolate was
inoculated into liquid medium supplemented with
chloramphenicol and kanamycin and incubated overnight at
37°C. The overnight culture was subcultured the following
morning into 500 ml of liquid broth with chloramphenicol
(25 µg/ml) and kanamycin (50 µg/ml) and grown with aeration
at 37°C in an orbital air-shaker until the absorbance at
600 nm reached 0.4-0.7. Isopropyl-thio-galactoside (IPTG)
was added to a final concentration of 0.5 mM, for
induction, and the culture was incubated for 3-4 hours at
37°C as before. The induced cells were pelleted by
centrifugation and resuspended in 25 ml of 20 mM NaPO4 (pH
7.7). A small aliquot was removed for analysis by gel
electrophoresis. Expressing clones produced proteins which
migrated at the 12 kDa position.

A crude cell lysate was prepared from the culture as
described for recombinant OspA by Dunn, J.J. et al.
(Protein Expression and Purification 1: 159 (1990)). The
crude lysate was first passed over a Q-sepharose column
(Pharmacia) which had been pre-equilibrated in Buffer A:
10 mM NaPO4 (pH 7.7), 10 mM NaCl, 0.5 mM PMSF. The column
was washed with 10 mM NaPO4, 50 mM NaCl and 0.5 mM PMSF and
then p12 was eluted in 10 mM NaPO4, 0.5 mM PMSF with a NaCl
gradient from 50-400 mM. p12 eluted approximately halfway
through the gradient between 100 and 200 mM NaCl. The peak
fractions were pooled and dialyzed against 10 mM NaPO4 (pH
7.7), 10 mM NaCl, 0.5 mM PMSF. The protein was then
concentrated and applied to a Sephadex G50 gel filtration
column of approximately 50 ml bed volume (Pharmacia), in 10
mM NaPO4, 200 mM NaCl, 0.5 mM PMSF. p12 would typically
elute shortly after the excluded volume marker. Peak
fractions were determined by running small aliquots of all
fractions on a gel. The p12 peak was pooled and stored in
small aliquots at -20°C.

Example 4. Generation of Chimeric Nucleic Acid
Sequences and Chimeric Proteins

A. General Protocol for Creation of Chimeric Nucleic Acid
Sequences

The megaprimer method of site directed mutagenesis and
its modification were used to generate chimeric nucleic
acid sequences (Sarkar and Sommer, Biotechniques 8(4): 404-
407 (1990); Aiyar, A. and J. Leis, Biotechniques 14(3):
366-369 (1993)). A 5' primer for the first genomic
template and a 3' fusion oligo are used to amplify the
desired region. The fusion primer consists of a 3' end of
the first template (DNA that encodes the amino-proximal
polypeptide of the fusion protein), coupled to a 5' end of
the second template (DNA that encodes the carboxy-proximal
polypeptide of the fusion protein).

The PCR amplifications are performed using Taq DNA
polymerase, 10X PCR buffer, and MgCl2 (Promega Corp.,
Madison, WI), and Ultrapure dNTPs (Pharmacia, Piscataway,
NJ). One µg of genomic template 1, 5 µl of 10 µM 5' oligo
and 5 µl of 10 µM fusion oligo are combined with the
following reagents at indicated final concentrations: 10X
Buffer-Mg FREE (1X), MgCl2 (2 mM), dNTP mix (200 µM each
dNTP), Taq DNA polymerase (2.5 units), water to bring final
volume to 100 µl. A Thermal Cycler (Perkin Elmer Cetus,
Norwalk, CT) is used to amplify under the following
conditions: 35 cycles at 95°C for one minute, 55°C for two
minutes, and 72°C for three minutes. This procedure results
in a "megaprimer".
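
The reagent volumes implied by the final concentrations above
follow from the dilution relation C1 x V1 = C2 x V2. The sketch
below works through that arithmetic for the 100 µl reaction. The
10X buffer and 10 µM oligo stocks come from the text; the 25 mM
MgCl2 stock concentration is an assumption made purely for
illustration, since the text does not state it.

```python
def stock_volume_ul(stock_conc: float, final_conc: float,
                    final_volume_ul: float) -> float:
    """Solve C1*V1 = C2*V2 for V1; both concentrations must share a unit."""
    return final_conc * final_volume_ul / stock_conc


FINAL_VOLUME_UL = 100.0

# 10X buffer diluted to 1X (from the text).
buffer_ul = stock_volume_ul(10.0, 1.0, FINAL_VOLUME_UL)

# MgCl2 to 2 mM final. ASSUMPTION: a 25 mM stock, not stated in the text.
mgcl2_ul = stock_volume_ul(25.0, 2.0, FINAL_VOLUME_UL)

# 5 µl of a 10 µM oligo stock in 100 µl (from the text).
oligo_final_um = 10.0 * 5.0 / FINAL_VOLUME_UL   # final concentration, µM
oligo_pmol = 10.0 * 5.0                         # µM x µl = pmol

# 35 cycles of 1 + 2 + 3 minutes, ignoring ramp times between steps.
cycle_minutes = 35 * (1 + 2 + 3)

print(f"10X buffer: {buffer_ul} µl; MgCl2 (assumed 25 mM stock): {mgcl2_ul} µl")
print(f"each oligo: {oligo_final_um} µM final ({oligo_pmol} pmol)")
print(f"hold time across all cycles: {cycle_minutes} min")
```

Each oligo is therefore present at 0.5 µM, or 50 pmol per
reaction, and the programmed hold times alone come to 3.5 hours.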

The resulting megaprimer is run on a 1X TAE, 4% low-
melt agarose gel. The megaprimer band is cut from the gel
and purified using the Promega Magic PCR Preps DNA
purification system. Purified megaprimer is then used in a
second PCR step. One µg of genomic template 2,
approximately 0.5 µg of the megaprimer, and 5 µl of 10 µM 3'
oligo are added to a cocktail of 10X buffer, MgCl2, dNTPs
and Taq at the same final concentrations as noted above,
and brought to 100 µl with water. PCR conditions are the
same as above. The fusion product resulting from this
amplification is also purified using the Promega Magic PCR
Preps DNA purification system.

The fusion product is then ligated into TA vector and
transformed into E. coli using the Invitrogen (San Diego,
CA) TA Cloning Kit. Approximately 50 ng of PCR fusion
product is ligated to 50 ng of pCRII vector with 1X
Ligation Buffer, 4 units of T4 ligase, and brought to 10 µl
with water. This ligated product mixture is incubated at
12°C overnight (approximately 14 hours). Two µl of the
ligation product mixture is added to 50 µl competent INVαF'
cells and 2 µl beta-mercaptoethanol. The cells are then
incubated for 30 minutes, followed by heat shock treatment
at 42°C for 60 seconds, and an ice quenching for two
minutes. 450 µl of warmed SOC media is then added to the
cells, resulting in a transformed cell culture which is
incubated at 37°C for one hour with slight shaking. 50 µl
of the transformed cell culture is plated on LB + 50 µg/ml
ampicillin plates and incubated overnight at 37°C. Single
white colonies are picked and added to individual overnight
cultures containing 3 ml LB with ampicillin (50 µg/ml).
Plasmid DNA is prepared from the individual overnight
cultures using Promega's Magic Miniprep DNA purification
system. A small amount of the resulting DNA is cut using a
restriction digest as a check. DNA sequencing is then
performed to check the sequence of the fusion nucleic acid
sequence, using the United States Biochemical (Cleveland,
OH) Sequenase Version 2.0 DNA sequencing kit. Three to five
µg of plasmid DNA is used per reaction. 2 µl 2M NaOH/2mM
EDTA are added to the DNA, and the volume is brought to 20
µl with water. The mixture is then incubated at room
temperature for five minutes. 7 µl water, 3 µl 3M NaAc, and
75 µl EtOH are added. The resultant mixture is mixed by
vortex and incubated for ten minutes at -70°C, and then
subjected to microfugation. After microfugation for ten
minutes, the supernatant is aspirated off, and the pellet
is dried in the speed vac for 30 seconds. 6 µl water, 2 µl
annealing buffer, and 2 µl of 10 µM of the appropriate
oligo are then added. This mixture is incubated for 10
minutes at 37°C and then allowed to stand at room
temperature for 10 minutes. Subsequently, 5.5 µl of label
cocktail (described above) is added to each sample of the
mixture, which are incubated at room temperature for an
additional five minutes. 3.5 µl labeled DNA is then added
to each sample which is then incubated for five minutes at
37°C. 4 µl stop solution is added to each well. The DNA
is denatured at 95°C for two minutes, and then placed on
ice.
Clones with the desired fusion nucleic acid sequences
are then recloned in frame in the pET expression system in
the lipidated (full length) and non-lipidated (truncated,
i.e., without first 17 amino acids) forms. The product is
amplified using restriction sites contained in the PCR
primers. The vector and product are cut with the same
enzymes and ligated together with T4 ligase. The resultant
plasmid is transformed into competent E. coli using
standard transformation techniques. Colonies are screened
as described earlier and positive clones are transformed
into expression cells, such as E. coli BL21, for protein
expression with IPTG for induction. The expressed protein
in its bacterial culture lysate form and/or purified form
is then injected in mice for antibody production. The mice
are bled, and the sera collected for agglutination, in
vitro growth inhibition, and complement-dependent and
-independent lysis tests.

B. Specific Chimeric Nucleic Acid Sequences

Various chimeric nucleic acid sequences were
generated. The nucleic acid sequences are described as
encoding polypeptides from Borrelia proteins. The chimeric
nucleic acid sequences are produced such that the nucleic
acid sequence encoding one polypeptide is in the same
reading frame as the nucleic acid sequence encoding the
next polypeptide in the chimeric protein sequence encoded
by the chimeric nucleic acid sequence. The proteins are
listed sequentially (in order of presence of the encoding
sequence) in the description of the chimeric nucleic acid
sequence. For example, if a chimeric nucleic acid sequence
consisting of bp 1-650 from OspA-1 and bp 651-820 from
OspA-2 were sequenced, the sequence of the chimer would
include the first 650 base pairs from OspA-1 followed
immediately by base pairs 651-820 of OspA-2.
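
The coordinate convention in the example above (1-based,
inclusive base-pair ranges joined end to end) can be sketched in
code. The following is illustrative only: the two input sequences
are placeholders rather than real OspA sequences, the function
names are hypothetical, and the frame check assumes that bp 1 is
the first base of the start codon in both parent genes.

```python
def segment(seq: str, start: int, end: int) -> str:
    """Return bases start..end of seq, using 1-based inclusive coordinates."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError(f"range {start}-{end} is outside a {len(seq)} bp sequence")
    return seq[start - 1:end]


def build_chimera(parts) -> str:
    """Concatenate (sequence, start, end) segments in the order listed."""
    return "".join(segment(seq, start, end) for seq, start, end in parts)


def junction_in_frame(prev_end: int, next_start: int) -> bool:
    """True if the second segment resumes in the codon phase the first left off.

    ASSUMPTION: both genes are numbered from the first base of the start codon.
    """
    return prev_end % 3 == (next_start - 1) % 3


# Placeholder sequences standing in for two 820 bp OspA genes.
ospa_1 = "A" * 820
ospa_2 = "C" * 820

chimera = build_chimera([(ospa_1, 1, 650), (ospa_2, 651, 820)])
print(len(chimera), junction_in_frame(650, 651))
```

With these ranges the chimer has the same length as either parent,
and the junction preserves the reading frame, consistent with the
requirement stated above.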

OspA-K48/OspA-PGau   A chimer of OspA from strain
K48 (OspA-K48) and OspA from strain PGau (OspA-PGau) was
generated using the method described above. This chimeric
nucleic acid sequence included bp 1-654 from OspA-K48,
followed by bp 655-820 from OspA-PGau. Primers used
included: the amino-terminal sequence of OspA primer #607
(SEQ ID NO. 16); the fusion primer,
5'-AAAGTAGAAGTTTTTGAATCCCATTTTCCAGTTTTTTT-3' (minus strand
primer #668-654) (SEQ ID NO. 84); the carboxy-terminal
sequence of OspA primer #586 (SEQ ID NO. 19); and the
sequence primers #369 (SEQ ID NO. 14) and #357 (SEQ ID NO.
15). The chimeric nucleic acid sequence is presented as
SEQ ID NO. 85; the chimeric protein encoded by this
chimeric nucleic acid sequence is presented as SEQ ID NO.
86.



OspA-B31/OspA-PGau   A chimer of OspA from strain B31
(OspA-B31) and OspA from strain PGau (OspA-PGau) was generated
using the method described above. This chimeric nucleic
acid sequence included bp 1-651 from OspA-B31, followed by
bp 652-820 from OspA-PGau. Primers used included: the
fusion primer,
5'-AAAGTAGAAGTTTTTGAATTCCAAGCTGCAGTTTT-3' (minus strand
primer #668-651) (SEQ ID NO. 87); and the sequence primer,
#369 (SEQ ID NO. 14). The chimeric nucleic acid sequence
is presented as SEQ ID NO. 88; the chimeric protein encoded
by this chimeric nucleic acid sequence is presented as SEQ
ID NO. 89.

OspA-B31/OspA-K48   A chimer of OspA from strain B31
(OspA-B31) and OspA from strain K48 (OspA-K48) was generated
using the method described above. This chimeric nucleic
acid sequence included bp 1-651 from OspA-B31, followed by
bp 652-820 from OspA-K48. Primers used included: the
fusion primer,
5'-AAAGTGGAAGTTTTTGAATTCCAAGCTGCAGTTTTTTT-3' (minus strand
primer #671-651) (SEQ ID NO. 90); and the sequence primer,
#369 (SEQ ID NO. 14). The chimeric nucleic acid sequence
is presented as SEQ ID NO. 91; the chimeric protein encoded
by this chimeric nucleic acid sequence is presented as SEQ
ID NO. 92.

OspA-B31/OspA-25015   A chimer of OspA from strain B31
(OspA-B31) and OspA from strain 25015 (OspA-25015) was
generated using the method described above. This chimeric
nucleic acid sequence included bp 1-651 from OspA-B31,
followed by bp 652-820 from OspA-25015. Primers used
included: the fusion primer,
5'-TAAAGTTGAAGTGCCTGCATTCCAAGCTGCAGTTT-3' (SEQ ID NO. 93).
The chimeric nucleic acid sequence is presented as SEQ ID
NO. 94; the chimeric protein encoded by this chimeric
nucleic acid sequence is presented as SEQ ID NO. 95.

OspA-K48/OspA-B31/OspA-K48   A chimer of OspA from strain
B31 (OspA-B31) and OspA from strain K48 (OspA-K48) was
generated using the method described above. This chimeric
nucleic acid sequence included bp 1-570 from OspA-B31,
followed by bp 570-651 from OspA-B31, followed by bp 650-
820 from OspA-K48. Primers used included: the fusion
primer, 5'-CCCCAGATTTTGAAATCTTGCTTAAAACAAC-3' (SEQ ID NO.
96); and the sequence primer, #357 (SEQ ID NO. 15). The
chimeric nucleic acid sequence is presented as SEQ ID NO.
97; the chimeric protein encoded by this chimeric nucleic
acid sequence is presented as SEQ ID NO. 98.

OspA-B31/OspA-K48/OspA-B31/OspA-K48   A chimer of OspA
from strain B31 (OspA-B31) and OspA from strain K48
(OspA-K48) was generated using the method described above.
This chimeric nucleic acid sequence included bp 1-420 from
OspA-B31, followed by 420-570 from OspA-K48, followed by bp
570-650 from OspA-B31, followed by bp 651-820 from OspA-K48.
Primers used included: the fusion primer,
5'-CAAGTCTGGTTCCAATTTGCTCTTGTTATTAT-3' (minus strand primer
#436-420) (SEQ ID NO. 99); and the sequence primer, #357
(SEQ ID NO. 15). The chimeric nucleic acid sequence is
presented as SEQ ID NO. 100; the chimeric protein encoded
by this chimeric nucleic acid sequence is presented as SEQ
ID NO. 101.

OspA-B31/OspB-B31   A chimer of OspA and OspB from strain
B31 (OspA-B31, OspB-B31) was generated using the method
described above. The chimeric nucleic acid sequence
included bp 1-651 from OspA-B31, followed by bp 652-820
from OspB-B31. Primers used included: the fusion primer,
5'-GTTAAAGTGCTAGTACTGTCATTCCAAGCTGCAGTTTTTTT-3' (minus
strand primer #740-651) (SEQ ID NO. 102); the carboxy-
terminal sequence of OspB primer #1106 (SEQ ID NO. 25); and
the sequence primer #357 (SEQ ID NO. 15). The chimeric
nucleic acid sequence is presented as SEQ ID NO. 103; the
chimeric protein encoded by this chimeric nucleic acid
sequence is presented as SEQ ID NO. 104.

OspA-B31/OspB-B31/OspC-B31   A chimer of OspA, OspB and
OspC from strain B31 (OspA-B31, OspB-B31, and OspC-B31) was
generated using the method described above. The chimeric
nucleic acid sequence included bp 1-650 from OspA-B31,
followed by bp 652-820 from OspB-B31, followed by bp 74-630
of OspC-B31. Primers used included: the fusion primer,
5'-TGCAGATGTAATCCCATCCGCCATTTTTAAAGCGTTTTT-3' (SEQ ID NO.
105); and the carboxy-terminal sequence of OspC primer (SEQ
ID NO. 28). The chimeric nucleic acid sequence is
presented as SEQ ID NO. 106; the chimeric protein encoded
by this chimeric nucleic acid sequence is presented as SEQ
ID NO. 107.

OspC-B31/OspA-B31/OspB-B31   A chimer of OspA, OspB and
OspC from strain B31 (OspA-B31, OspB-B31, and OspC-B31) was
generated using the method described above. The chimeric
nucleic acid sequence included bp 1-630 from OspC-B31,
followed by bp 52-650 from OspA-B31, followed by bp 650-820
of OspB-B31. Primers used included: the amino-terminal
sequence of OspC primer having SEQ ID NO. 27; the fusion
primer, 5'-GCTGCTAACATTTTGCTTAGGTTTTTTTGGACTTTC-3' (minus
strand primer #69-630) (SEQ ID NO. 108); and the sequence
primers #520 (SEQ ID NO. 40) and #200 (SEQ ID NO. 18). The
chimeric nucleic acid sequence is presented as SEQ ID NO.
109; the chimeric protein encoded by this chimeric nucleic
acid sequence is presented as SEQ ID NO. 110.

Additional Chimeric Nucleic Acid Sequences

Using the methods described above, other chimeric
nucleic acid sequences were produced. These chimeric
nucleic acid sequences, and the proteins encoded, are
summarized in Table III.



Table III  Chimeric Nucleic Acid Sequences and the Encoded
Proteins

Chimers Generated (base pairs)                 SEQ ID NO.  SEQ ID NO.
                                               (nt)        (protein)
OspA (52-882) / p93 (1168-2100)                111         112
OspB (45-891) / p41 (122-234)                  113         114
OspB (45-891) / p41 (122-295)                  115         116
OspB (45-891) / p41 (140-234)                  117         118
OspB (45-891) / p41 (140-295)                  119         120
OspB (45-891) / p41 (122-234) / OspC (58-633)  121         122
OspA-Tro/OspA-Bo                               137         138
OspA-PGau/OspA-Bo                              139         140
OspA-B31/OspA-PGau/OspA-B31/OspA-K48           141         142
OspA-PGau/OspA-B31/OspA-K48                    143         144

C. Purification of Proteins Generated by Chimeric Nucleic
Acid Sequences

The chimeric nucleic acid sequences described above, 
as well as chimeric nucleic acid sequences produced by the 
methods described above, are used to produce chimeric 
proteins encoded by the nucleic acid sequences. Standard 
methods, such as those described above in Example 3, 
concerning the expression of proteins from Borrelia genes, 
can be used to express the proteins in a compatible host 
organism. The chimeric proteins can then be isolated and 
purified using standard techniques. 

If the chimeric protein is soluble, it can be purified
on a Sepharose column. Insoluble proteins can be
solubilized in guanidine and purified on a Ni++ column;
alternatively, they can be solubilized in 10 mM NaPO4 with
0.1 - 1% Triton X-114, and subsequently purified over an S
column (Pharmacia). Lipidated proteins were generally
purified by the latter method. Solubility was determined
by separating both soluble and insoluble fractions of cell
lysate on a 12% PAGE gel, and checking for the localization
of the protein by Coomassie staining, or by Western blotting
with monoclonal antibodies directed to an antigenic
polypeptide of the chimeric protein.



Equivalents

Those skilled in the art will recognize, or be able to
ascertain using no more than routine experimentation, many
equivalents to the specific embodiments of the invention
described herein. Such equivalents are intended to be
encompassed in the scope of the following claims.

CLAIMS

What is claimed is:

1. A chimeric protein comprising two or more antigenic
Borrelia polypeptides, wherein the antigenic Borrelia
polypeptides which comprise the chimeric protein do
not occur naturally in the same protein in Borrelia.

2. The chimeric protein of Claim 1, wherein the antigenic
Borrelia polypeptides are from two or more different
species of Borrelia.

3. The chimeric protein of Claim 2, wherein the antigenic
Borrelia polypeptides are derived from Borrelia
proteins selected from the group consisting of: outer
surface protein A, outer surface protein B, outer
surface protein C, outer surface protein D, p12, p39,
p41, p66, and p93.

4. The chimeric protein of Claim 3, wherein the antigenic
Borrelia polypeptides are from corresponding proteins
from two or more different species of Borrelia.

5. The chimeric protein of Claim 3, wherein the antigenic
Borrelia polypeptides are from non-corresponding
proteins from at least two different species of
Borrelia.

6. The chimeric protein of Claim 1, wherein two or more
antigenic Borrelia polypeptides are from the same
species of Borrelia.

7. The chimeric protein of Claim 6, wherein the antigenic
Borrelia polypeptides are derived from Borrelia
proteins selected from the group consisting of: outer
surface protein A, outer surface protein B, outer
surface protein C, outer surface protein D, p12, p39,
p41, p66, and p93.

8. The chimeric protein of Claim 7, wherein the antigenic
Borrelia polypeptides are from the same protein.

9. The chimeric protein of Claim 6, wherein the antigenic
Borrelia polypeptides are from different proteins.

10. A chimeric protein comprising two antigenic Borrelia
polypeptides flanking a tryptophan residue, wherein
the amino-proximal polypeptide consists of a
polypeptide that is proximal from the single
tryptophan residue of a first outer surface protein of
Borrelia, and the carboxy-proximal polypeptide
consists of a polypeptide that is distal from the
single tryptophan residue of a second outer surface
protein of Borrelia.

11. The chimeric protein of Claim 10, wherein the first
and second outer surface proteins are from the same
species of Borrelia.

12. The chimeric protein of Claim 11, wherein the first
outer surface protein is outer surface protein A and
the second outer surface protein is outer surface
protein B.

13. The chimeric protein of Claim 11, wherein the first
outer surface protein is outer surface protein B, and
the second outer surface protein is outer surface
protein A.
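
The tryptophan-junction construction recited in Claim 10 can be illustrated at the amino acid level. The sketch below is only an illustration under stated assumptions: the function names are hypothetical, proteins are written as one-letter strings, and the two input sequences are invented toy strings, not OspA or OspB.

```python
def split_at_single_trp(protein: str) -> tuple[str, str]:
    """Split a one-letter protein string around its single tryptophan (W)."""
    if protein.count("W") != 1:
        raise ValueError("expected exactly one tryptophan (W)")
    before, _, after = protein.partition("W")
    return before, after


def trp_junction_chimera(first: str, second: str) -> str:
    """Join the part of `first` before its W to the part of `second` after its W."""
    amino_part, _ = split_at_single_trp(first)     # amino-proximal polypeptide
    _, carboxy_part = split_at_single_trp(second)  # carboxy-proximal polypeptide
    return amino_part + "W" + carboxy_part


# Toy sequences, invented for illustration only.
toy_first = "MKKAAAWCCC"
toy_second = "MRLDDDDWEEEE"
print(trp_junction_chimera(toy_first, toy_second))  # MKKAAAWEEEE
```

The check for exactly one tryptophan reflects the claim's reference to "the single tryptophan residue"; a sequence with zero or several tryptophans is rejected rather than guessed at.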

14. The chimeric protein of Claim 10, wherein the first
and second outer surface proteins are from different
species of Borrelia.

15. The chimeric protein of Claim 14, wherein the first
outer surface protein is outer surface protein A and
the second outer surface protein is outer surface
protein B.

16. The chimeric protein of Claim 14, wherein the first
outer surface protein is outer surface protein B, and
the second outer surface protein is outer surface
protein A.

17. The chimeric protein of Claim 14, wherein the first
and second outer surface proteins are corresponding
proteins selected from the group consisting of: outer
surface protein A and outer surface protein B.

18. The chimeric protein of Claim 10, wherein the first
outer surface protein is outer surface protein A and
the second outer surface protein is outer surface
protein B.

19. The chimeric protein of Claim 18, wherein the amino-proximal
polypeptide further comprises a first,
second, and third hypervariable domain, the first
hypervariable domain consisting of residues 120
through 140 of outer surface protein A, the second
hypervariable domain consisting of residues 150
through 180 of outer surface protein A, and the third
hypervariable domain consisting of residues 200
through 217 of outer surface protein A.

20. The chimeric protein of Claim 19, wherein the first
and second hypervariable domains are derived from
outer surface protein A from different species of
Borrelia.

21. The chimeric protein of Claim 10, further comprising
an antigenic Borrelia polypeptide derived from a
Borrelia protein selected from the group consisting
of: outer surface protein A, outer surface protein B,
outer surface protein C, outer surface protein D, p12,
p39, p41, p66, and p93.
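
The residue ranges recited in Claim 19 can be read as slices of a protein sequence. The following sketch assumes 1-based, inclusive residue numbering and assumes that donor and recipient sequences share the same numbering; both are assumptions made here for illustration, and the function names are hypothetical.

```python
# Residue ranges of the three hypervariable domains, as recited in Claim 19.
HYPERVARIABLE_DOMAINS = {
    "first": (120, 140),
    "second": (150, 180),
    "third": (200, 217),
}


def domain(protein: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive) of a one-letter protein string."""
    if not 1 <= start <= end <= len(protein):
        raise ValueError("residue range lies outside the sequence")
    return protein[start - 1:end]


def swap_domain(recipient: str, donor: str, start: int, end: int) -> str:
    """Replace residues start..end of `recipient` with the same range of `donor`.

    Assumes both sequences use the same residue numbering.
    """
    return recipient[:start - 1] + domain(donor, start, end) + recipient[end:]
```

With two placeholder sequences of equal length, `swap_domain(a, b, *HYPERVARIABLE_DOMAINS["first"])` yields a sequence whose residues 120 through 140 come from `b` and whose remaining residues come from `a`, which mirrors the mixed-species arrangement of Claim 20.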

22. A nucleic acid sequence encoding a chimeric protein
comprising two antigenic Borrelia polypeptides,
wherein the two antigenic Borrelia polypeptides which
comprise the chimeric protein do not occur naturally
in the same protein in Borrelia.

23. The nucleic acid sequence of Claim 22, wherein the
antigenic Borrelia polypeptides are from two or more
different species of Borrelia.

24. The nucleic acid sequence of Claim 23, wherein the
antigenic Borrelia polypeptides are derived from
Borrelia proteins selected from the group consisting
of: outer surface protein A, outer surface protein B,
outer surface protein C, outer surface protein D, p12,
p39, p41, p66, and p93.

25. The nucleic acid sequence of Claim 24, wherein the
antigenic Borrelia polypeptides are from corresponding
proteins from two or more different species of
Borrelia.

26. The nucleic acid sequence of Claim 24, wherein two or
more of the antigenic Borrelia polypeptides are from
non-corresponding proteins from different species of
Borrelia.

27. The nucleic acid sequence of Claim 22, wherein two or
more antigenic Borrelia polypeptides are from the same
species of Borrelia.

28. The nucleic acid sequence of Claim 27, wherein the
antigenic Borrelia polypeptides are derived from
Borrelia proteins selected from the group consisting
of: outer surface protein A, outer surface protein B,
outer surface protein C, outer surface protein D, p12,
p39, p41, p66, and p93.

29. The nucleic acid sequence of Claim 28, wherein the
antigenic Borrelia polypeptides are from the same
protein.

30. The nucleic acid sequence of Claim 27, wherein the
antigenic Borrelia polypeptides are from different
proteins.
31. A nucleic acid sequence encoding a chimeric protein
comprising two antigenic Borrelia polypeptides
flanking a tryptophan residue, wherein the amino-proximal
polypeptide consists of a polypeptide that is
proximal from the single tryptophan residue of a first
outer surface protein of Borrelia, and the carboxy-proximal
polypeptide consists of a polypeptide that is
distal from the single tryptophan residue of a second
outer surface protein of Borrelia.

32. The nucleic acid sequence of Claim 31, wherein the
first and second outer surface proteins are from the
same species of Borrelia.

33. The nucleic acid sequence of Claim 32, wherein the
first outer surface protein is outer surface protein A
and the second outer surface protein is outer surface
protein B.

34. The nucleic acid sequence of Claim 32, wherein the
first outer surface protein is outer surface protein
B, and the second outer surface protein is outer
surface protein A.

35. The nucleic acid sequence of Claim 31, wherein the
first and second outer surface proteins are from
different species of Borrelia.

36. The nucleic acid sequence of Claim 35, wherein the
first outer surface protein is outer surface protein A
and the second outer surface protein is outer surface
protein B.

37. The nucleic acid sequence of Claim 35, wherein the
first outer surface protein is outer surface protein
B, and the second outer surface protein is outer
surface protein A.

38. The nucleic acid sequence of Claim 35, wherein the
first and second outer surface proteins are
corresponding proteins selected from the group
consisting of: outer surface protein A and outer
surface protein B.

39. The nucleic acid sequence of Claim 31, wherein the
first outer surface protein is outer surface protein A
and the second outer surface protein is outer surface
protein B.

40. The nucleic acid sequence of Claim 39, wherein the
amino-proximal polypeptide further comprises a first
and a second hypervariable domain, the first
hypervariable domain consisting of amino acid residues
1 through 140 of outer surface protein A, and the
second hypervariable domain consisting of amino acid
residues 150 through 217 of outer surface protein A.

41. The nucleic acid sequence of Claim 40, wherein the
first and second hypervariable domains are derived
from outer surface protein A from different species of
Borrelia.

42. The nucleic acid sequence of Claim 31, further
comprising an antigenic Borrelia polypeptide derived
from a Borrelia protein selected from the group
consisting of: outer surface protein A, outer surface
protein B, outer surface protein C, outer surface
protein D, p12, p39, p41, p66, and p93.
43. A nucleic acid sequence having a sequence selected
from the group consisting of: SEQ ID NO. 85, SEQ ID
NO. 88, SEQ ID NO. 91, SEQ ID NO. 94, SEQ ID NO. 97,
SEQ ID NO. 100, SEQ ID NO. 103, SEQ ID NO. 106, SEQ ID
NO. 109, SEQ ID NO. 111, SEQ ID NO. 113, SEQ ID NO.
115, SEQ ID NO. 117, SEQ ID NO. 119, SEQ ID NO. 121,
SEQ ID NO. 137, SEQ ID NO. 139, SEQ ID NO. 141, and
SEQ ID NO. 143.

44. A protein having an amino acid sequence selected from
the group consisting of: SEQ ID NO. 86, SEQ ID NO.
89, SEQ ID NO. 92, SEQ ID NO. 95, SEQ ID NO. 98, SEQ
ID NO. 101, SEQ ID NO. 104, SEQ ID NO. 107, SEQ ID NO.
110, SEQ ID NO. 112, SEQ ID NO. 114, SEQ ID NO. 116,
SEQ ID NO. 118, SEQ ID NO. 120, SEQ ID NO. 122, SEQ ID
NO. 138, SEQ ID NO. 140, SEQ ID NO. 142, and SEQ ID
NO. 144.



45. A chimeric protein according to any one of claims 1 to
21 and 44 for use in therapy or diagnosis, for example
as a vaccine against Borrelia infection, in
immunodiagnostic assays to detect the presence of
antibodies to Borrelia or to measure T-cell
reactivity.

46. A chimeric protein according to claim 45, wherein the
immunodiagnostic assay is a dot blot, Western blot,
ELISA or agglutination assay.

47. Use of the chimeric protein according to any one of
claims 1 to 21 and 44, or the nucleic acid sequence of
any one of claims 22 to 43, for the manufacture of a
compound for use in therapy or diagnosis, for example
as a vaccine against Borrelia infection, in
immunodiagnostic assays to detect the presence of
antibodies to Borrelia or to measure T-cell
reactivity.

48. Use according to claim 47, wherein the
immunodiagnostic assay is a dot blot, Western blot,
ELISA or agglutination assay.
ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCC TTA ATA GCA 48
Met Lys Lys Tyr Leu Leu Gly Ile Gly Leu Ile Leu Ala Leu Ile Ala
1 5 10 15

TGT AAG CAA AAT GTT AGC AGC CTT GAC GAG AAA AAC AGC GTT TCA GTA 96
Cys Lys Gln Asn Val Ser Ser Leu Asp Glu Lys Asn Ser Val Ser Val
20 25 30

GAT TTG CCT GGT GAA ATG AAA GTT CTT GTA AGC AAA GAA AAA AAC AAA 144
Asp Leu Pro Gly Glu Met Lys Val Leu Val Ser Lys Glu Lys Asn Lys
35 40 45

GAC GGC AAG TAC GAT CTA ATT GCA ACA GTA GAC AAG CTT GAG CTT AAA 192
Asp Gly Lys Tyr Asp Leu Ile Ala Thr Val Asp Lys Leu Glu Leu Lys
50 55 60

GGA ACT TCT GAT AAA AAC AAT GGA TCT GGA GTA CTT GAA GGC GTA AAA 240
Gly Thr Ser Asp Lys Asn Asn Gly Ser Gly Val Leu Glu Gly Val Lys
65 70 75 80

GCT GAC AAA AGT AAA GTA AAA TTA ACA ATT TCT GAC GAT CTA GGT CAA 288
Ala Asp Lys Ser Lys Val Lys Leu Thr Ile Ser Asp Asp Leu Gly Gln
85 90 95

ACC ACA CTT GAA GTT TTC AAA GAA GAT GGC AAA ACA CTA GTA TCA AAA 336
Thr Thr Leu Glu Val Phe Lys Glu Asp Gly Lys Thr Leu Val Ser Lys
100 105 110

AAA GTA ACT TCC AAA GAC AAG TCA TCA ACA GAA GAA AAA TTC AAT GAA 384
Lys Val Thr Ser Lys Asp Lys Ser Ser Thr Glu Glu Lys Phe Asn Glu
115 120 125

AAA GGT GAA GTA TCT GAA AAA ATA ATA ACA AGA GCA GAC GGA ACC AGA 432
Lys Gly Glu Val Ser Glu Lys Ile Ile Thr Arg Ala Asp Gly Thr Arg
130 135 140

CTT GAA TAC ACA GGA ATT AAA AGC GAT GGA TCT GGA AAA GCT AAA GAG 480
Leu Glu Tyr Thr Gly Ile Lys Ser Asp Gly Ser Gly Lys Ala Lys Glu
145 150 155 160

GTT TTA AAA GGC TAT GTT CTT GAA GGA ACT CTA ACT GCT GAA AAA ACA 528
Val Leu Lys Gly Tyr Val Leu Glu Gly Thr Leu Thr Ala Glu Lys Thr
165 170 175

ACA TTG GTG GTT AAA GAA GGA ACT GTT ACT TTA AGC AAA AAT ATT TCA 576
Thr Leu Val Val Lys Glu Gly Thr Val Thr Leu Ser Lys Asn Ile Ser
180 185 190

AAA TCT GGG GAA GTT TCA GTT GAA CTT AAT GAC ACT GAC AGT AGT GCT 624
Lys Ser Gly Glu Val Ser Val Glu Leu Asn Asp Thr Asp Ser Ser Ala
195 200 205

GCT ACT AAA AAA ACT GCA GCT TGG AAT TCA GGC ACT TCA ACT TTA ACA 672
Ala Thr Lys Lys Thr Ala Ala Trp Asn Ser Gly Thr Ser Thr Leu Thr
210 215 220

ATT ACT GTA AAC AGT AAA AAA ACT AAA GAC CTT GTG TTT ACA AAA GAA 720
Ile Thr Val Asn Ser Lys Lys Thr Lys Asp Leu Val Phe Thr Lys Glu
225 230 235 240

AAC ACA ATT ACA GTA CAA CAA TAC GAC TCA AAT GGC ACC AAA TTA GAG 768
Asn Thr Ile Thr Val Gln Gln Tyr Asp Ser Asn Gly Thr Lys Leu Glu
245 250 255

GGG TCA GCA GTT GAA ATT ACA AAA CTT GAT GAA ATT AAA AAC GCT TTA 816
Gly Ser Ala Val Glu Ile Thr Lys Leu Asp Glu Ile Lys Asn Ala Leu
260 265 270

AAA TA 822
Lys
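
The pairing of each codon row with the amino acid row beneath it follows the standard genetic code, so a listing of this kind can be checked mechanically. The sketch below is an illustration, not part of the original figure: the function name `translate` is hypothetical, and the use of the standard code for these Borrelia genes is an assumption.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acids."""
    seq = "".join(dna.split()).upper()
    if len(seq) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# First row of the listing above.
row = "ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCC TTA ATA GCA"
print(translate(row))  # MKKYLLGIGLILALIA
```

The printed string corresponds to Met Lys Lys Tyr Leu Leu Gly Ile Gly Leu Ile Leu Ala Leu Ile Ala, the first amino acid row of the listing.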
ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCC TTA ATA GCA
TAC TTT TTT ATA AAT AAC CCT TAT CCA GAT TAT AAT CGG AAT TAT CGT
Met Lys Lys Tyr Leu Leu Gly Ile Gly Leu Ile Leu Ala Leu Ile Ala>

50 60 70 80 90
TGT AAG CAA AAT GTT AGC AGC CTT GAT GAA AAA AAT AGC GTT TCA GTA
ACA TTC GTT TTA CAA TCG TCG GAA CTA CTT TTT TTA TCG CAA AGT CAT
Cys Lys Gln Asn Val Ser Ser Leu Asp Glu Lys Asn Ser Val Ser Val>

100 110 120 130 140
GAT TTA CCT GGT GGA ATG ACA GTT CTT GTA AGT AAA GAA AAA GAC AAA
CTA AAT GGA CCA CCT TAC TGT CAA GAA CAT TCA TTT CTT TTT CTG TTT
Asp Leu Pro Gly Gly Met Thr Val Leu Val Ser Lys Glu Lys Asp Lys>

150 160 170 180 190
GAC GGT AAA TAC AGT CTA GAG GCA ACA GTA GAC AAG CTT GAG CTT AAA
CTG CCA TTT ATG TCA GAT CTC CGT TGT CAT CTG TTC GAA CTC GAA TTT
Asp Gly Lys Tyr Ser Leu Glu Ala Thr Val Asp Lys Leu Glu Leu Lys>

200 210 220 230 240
GGA ACT TCT GAT AAA AAC AAC GGT TCT GGA ACA CTT GAA GGT GAA AAA
CCT TGA AGA CTA TTT TTG TTG CCA AGA CCT TGT GAA CTT CCA CTT TTT
Gly Thr Ser Asp Lys Asn Asn Gly Ser Gly Thr Leu Glu Gly Glu Lys>

250 260 270 280
ACT GAC AAA AGT AAA GTA AAA TTA ACA ATT GCT GAT GAC CTA AGT CAA
TGA CTG TTT TCA TTT CAT TTT AAT TGT TAA CGA CTA CTG GAT TCA GTT
Thr Asp Lys Ser Lys Val Lys Leu Thr Ile Ala Asp Asp Leu Ser Gln>

290 300 310 320 330
ACT AAA TTT GAA ATT TTC AAA GAA GAT GCC AAA ACA TTA GTA TCA AAA
TGA TTT AAA CTT TAA AAG TTT CTT CTA CGG TTT TGT AAT CAT AGT TTT
Thr Lys Phe Glu Ile Phe Lys Glu Asp Ala Lys Thr Leu Val Ser Lys>

340 350 360 370 380
AAA GTA ACC CTT AAA GAC AAG TCA TCA ACA GAA GAA AAA TTC AAC GAA
TTT CAT TGG GAA TTT CTG TTC AGT AGT TGT CTT CTT TTT AAG TTG CTT
Lys Val Thr Leu Lys Asp Lys Ser Ser Thr Glu Glu Lys Phe Asn Glu>
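
In this three-line layout the second line of each row is the base-by-base complement of the first, written in the same left-to-right order, so the two strand lines can be cross-checked against each other. The sketch below illustrates such a check; the function names are hypothetical and it assumes both lines are written as space-separated codons.

```python
# Map each base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Return the base-by-base complement, without reversing the strand."""
    return "".join(strand.split()).upper().translate(_COMPLEMENT)


def mismatched_codons(top: str, bottom: str) -> list[int]:
    """Return 1-based positions where `bottom` is not the complement of `top`."""
    top_codons, bottom_codons = top.split(), bottom.split()
    if len(top_codons) != len(bottom_codons):
        raise ValueError("the two lines contain different numbers of codons")
    return [
        i
        for i, (t, b) in enumerate(zip(top_codons, bottom_codons), start=1)
        if complement(t) != b.upper()
    ]


top = "ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCC TTA ATA GCA"
bottom = "TAC TTT TTT ATA AAT AAC CCT TAT CCA GAT TAT AAT CGG AAT TAT CGT"
print(mismatched_codons(top, bottom))  # []
```

An empty list means the two strand lines agree; a non-empty list points at codons where one of the lines was probably misread.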
[...] GCA
[...] CGT
[...] Ile [...] Ala>

60 70 80 90
TGC AAG CAA AAT GTT AGC AGC CTT GAT GAA AAA AAC AGC GCT TCA GTA
ACG TTC GTT TTA CAA TCG TCG GAA CTA CTT TTT TTG TCG CGA AGT CAT
Cys Lys Gln Asn Val Ser Ser Leu Asp Glu Lys Asn Ser Ala Ser Val>

100 110 120 130 140
GAT TTG CCT GGT GAG ATG AAA GTT CTT GTA AGT AAA GAA AAA GAC AAA
CTA AAC GGA CCA CTC TAC TTT CAA GAA CAT TCA TTT CTT TTT CTG TTT
Asp Leu Pro Gly Glu Met Lys Val Leu Val Ser Lys Glu Lys Asp Lys>

150 160 170 180 190
GAC GGT AAG TAC AGT CTA AAG GCA ACA GTA GAC AAG ATT GAG CTA AAA
CTG CCA TTC ATG TCA GAT TTC CGT TGT CAT CTG TTC TAA CTC GAT TTT
Asp Gly Lys Tyr Ser Leu Lys Ala Thr Val Asp Lys Ile Glu Leu Lys>

200 210 220 230 240
GGA ACT TCT GAT AAA GAC AAT GGT TCT GGA GTG CTT GAA GGT ACA AAA
CCT TGA AGA CTA TTT CTG TTA CCA AGA CCT CAC GAA CTT CCA TGT TTT
Gly Thr Ser Asp Lys Asp Asn Gly Ser Gly Val Leu Glu Gly Thr Lys>

250 260 270 280
GAT GAC AAA AGT AAA GCA AAA TTA ACA ATT GCT GAC GAT CTA AGT AAA
CTA CTG TTT TCA TTT CGT TTT AAT TGT TAA CGA CTG CTA GAT TCA TTT
Asp Asp Lys Ser Lys Ala Lys Leu Thr Ile Ala Asp Asp Leu Ser Lys>

290 300 310 320 330
ACC ACA TTC GAA CTT TTA AAA GAA GAT GGC AAA ACA TTA GTG TCA AGA
TGG TGT AAG CTT GAA AAT TTT CTT CTA CCG TTT TGT AAT CAC AGT TCT
Thr Thr Phe Glu Leu Leu Lys Glu Asp Gly Lys Thr Leu Val Ser Arg>

340 350 360 370 380
AAA GTA AGT TCT AGA GAC AAA ACA TCA ACA GAT GAA ATG TTC AAT GAA
TTT CAT TCA AGA TCT CTG TTT TGT AGT TGT CTA CTT TAC AAG TTA CTT
Lys Val Ser Ser Arg Asp Lys Thr Ser Thr Asp Glu Met Phe Asn Glu>
390 400 410 420 430
AAA GGT GAA TTG TCT GCA AAA ACC ATG ACA AGA GAA AAT GGA ACC AAA
TTT CCA CTT AAC AGA CGT TTT TGG TAC TGT TCT CTT TTA CCT TGG TTT
Lys Gly Glu Leu Ser Ala Lys Thr Met Thr Arg Glu Asn Gly Thr Lys>

440 450 460 470 480
CTT GAA TAT ACA GAA ATG AAA AGC GAT GGA ACC GGA AAA GCT AAA GAA
GAA CTT ATA TGT CTT TAC TTT TCG CTA CCT TGG CCT TTT CGA TTT CTT
Leu Glu Tyr Thr Glu Met Lys Ser Asp Gly Thr Gly Lys Ala Lys Glu>

490 500 510 520
GTT TTA AAA AAG TTT ACT CTT GAA GGA AAA GTA GCT AAT GAT AAA GTA
CAA AAT TTT TTC AAA TGA GAA CTT CCT TTT CAT CGA TTA CTA TTT CAT
Val Leu Lys Lys Phe Thr Leu Glu Gly Lys Val Ala Asn Asp Lys Val>

530 540 550 560 570
ACA TTG GAA GTA AAA GAA GGA ACC GTT ACT TTA AGT AAG GAA ATT GCA
TGT AAC CTT CAT TTT CTT CCT TGG CAA TGA AAT TCA TTC CTT TAA CGT
Thr Leu Glu Val Lys Glu Gly Thr Val Thr Leu Ser Lys Glu Ile Ala>

580 590 600 610 620
AAA TCT GGA GAA GTA ACA GTT GCT CTT AAT GAC ACT AAC ACT ACT CAG
TTT AGA CCT CTT CAT TGT CAA CGA GAA TTA CTG TGA TTG TGA TGA GTC
Lys Ser Gly Glu Val Thr Val Ala Leu Asn Asp Thr Asn Thr Thr Gln>

630 640 650 660 670
GCT ACT AAA AAA ACT GGC GCA TGG GAT TCA AAA ACT TCT ACT TTA ACA
CGA TGA TTT TTT TGA CCG CGT ACC CTA AGT TTT TGA AGA TGA AAT TGT
Ala Thr Lys Lys Thr Gly Ala Trp Asp Ser Lys Thr Ser Thr Leu Thr>

680 690 700 710 720
ATT AGT GTT AAC AGC AAA AAA ACT ACA CAA CTT GTG TTT ACT AAA CAA
TAA TCA CAA TTG TCG TTT TTT TGA TGT GTT GAA CAC AAA TGA TTT GTT
Ile Ser Val Asn Ser Lys Lys Thr Thr Gln Leu Val Phe Thr Lys Gln>

730 740 750 760
TAC ACA ATA ACT GTA AAA CAA TAC GAC TCC GCA GGT ACC AAT TTA GAA
ATG TGT TAT TGA CAT TTT GTT ATG CTG AGG CGT CCA TGG TTA AAT CTT
Tyr Thr Ile Thr Val Lys Gln Tyr Asp Ser Ala Gly Thr Asn Leu Glu>
OspA-PGau

770 780 790 800 810
GGC ACA GCA GTC GAA ATT AAA ACA CTT GAT GAA CTT AAA AAC GCT TTA
CCG TGT CGT CAG CTT TAA TTT TGT GAA CTA CTT GAA TTT TTG CGA AAT
Gly Thr Ala Val Glu Ile Lys Thr Leu Asp Glu Leu Lys Asn Ala Leu>

820
AAA TAA
TTT ATT
Lys
WO 95/12676    15/133    PCT/US94/12352



ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCT TTA ATA GCA 48
Met Lys Lys Tyr Leu Leu Gly Ile Gly Leu Ile Leu Ala Leu Ile Ala
1 5 10 15

TGT AAG CAA AAT GTT AGC AGC CTT GAC GAG AAA AAC AGC GTT TCA GTA 96
Cys Lys Gln Asn Val Ser Ser Leu Asp Glu Lys Asn Ser Val Ser Val
20 25 30

GAT TTG CCT GGT GAA ATG AAA GTT CTT GTA AGC AAA GAA AAA GAC AAA 144
Asp Leu Pro Gly Glu Met Lys Val Leu Val Ser Lys Glu Lys Asp Lys
35 40 45

GAC GGC AAG TAC AGT CTA ATG GCA ACA GTA GAC AAG CTT GAG CTT AAA 192
Asp Gly Lys Tyr Ser Leu Met Ala Thr Val Asp Lys Leu Glu Leu Lys
50 55 60

Figure 10 (1 of 2)

GGA ACA TCT GAT AAA AAC AAT GGA TCT GGG GTG CTT GAA GGC GTA AAA 240
Gly Thr Ser Asp Lys Asn Asn Gly Ser Gly Val Leu Glu Gly Val Lys
65 70 75 80

GCT GAC AAA AGC AAA GTA AAA TTA ACA GTT TCT GAC GAT CTA AGC ACA 288
Ala Asp Lys Ser Lys Val Lys Leu Thr Val Ser Asp Asp Leu Ser Thr
85 90 95

ACC ACA CTT GAA GTT TTA AAA GAA GAT GGC AAA ACA TTA GTG TCA AAA 336
Thr Thr Leu Glu Val Leu Lys Glu Asp Gly Lys Thr Leu Val Ser Lys
100 105 110

AAA AGA ACT TCT AAA GAT AAG TCA TCA ACA GAA GAA AAG TTC AAT GAA 384
Lys Arg Thr Ser Lys Asp Lys Ser Ser Thr Glu Glu Lys Phe Asn Glu
115 120 125

AAA GGC GAA TTA GTT GAA AAA ATA ATG GCA AGA GCA AAC GGA ACC ATA 432
Lys Gly Glu Leu Val Glu Lys Ile Met Ala Arg Ala Asn Gly Thr Ile
130 135 140

CTT GAA TAC ACA GGA ATT AAA AGC GAT GGA TCC GGA AAA GCT AAA GAA 480
Leu Glu Tyr Thr Gly Ile Lys Ser Asp Gly Ser Gly Lys Ala Lys Glu
145 150 155 160

ACT TTA AAA GAA TAT GTT CTT GAA GGA ACT CTA ACT GCT GAA AAA GCA 528
Thr Leu Lys Glu Tyr Val Leu Glu Gly Thr Leu Thr Ala Glu Lys Ala
165 170 175

ACA TTG GTG GTT AAA GAA GGA ACT GTT ACT TTA AGT AAG CAC ATT TCA 576
Thr Leu Val Val Lys Glu Gly Thr Val Thr Leu Ser Lys His Ile Ser
180 185 190

AAA TCT GGA GAA GTA ACA GCT GAA CTT AAT GAC ACT GAC AGT ACT CAA 624
Lys Ser Gly Glu Val Thr Ala Glu Leu Asn Asp Thr Asp Ser Thr Gln
195 200 205

GCT ACT AAA AAA ACT GGG AAA TGG GAT GCA GGC ACT TCA ACT TTA ACA 672
Ala Thr Lys Lys Thr Gly Lys Trp Asp Ala Gly Thr Ser Thr Leu Thr
210 215 220

ATT ACT GTA AAC AAC AAA AAA ACT AAA GCC CTT GTA TTT ACA AAA CAA 720
Ile Thr Val Asn Asn Lys Lys Thr Lys Ala Leu Val Phe Thr Lys Gln
225 230 235 240

GAC ACA ATT ACA TCA CAA AAA TAC GAC TCA GCA GGA ACC AAC TTG GAA 768
Asp Thr Ile Thr Ser Gln Lys Tyr Asp Ser Ala Gly Thr Asn Leu Glu
245 250 255

GGC ACA GCA GTC GAA ATT AAA ACA CTT GAT GAA CTT AAA AAC GCT TTA 816
Gly Thr Ala Val Glu Ile Lys Thr Leu Asp Glu Leu Lys Asn Ala Leu
260 265 270

AGA 819
Arg
OspB
Sequence Range: 1 to 891



10 20 30
ATG AGA NNN TTA ATA GGA TTT GCT TTA GCG TTA GCT TTA ATA GGA TGT
TAC TCT NNN AAT TAT CCT AAA CGA AAT CGC AAT CGA AAT TAT CCT ACA
Met Arg Leu Leu Ile Gly Phe Ala Leu Ala Leu Ala Leu Ile Gly Cys>

50 60 70 80 90
GCA CAA AAA GGT GCT GAG TCA ATT GGT TCT CAA AAA GAA AAT GAT CTA
CGT GTT TTT CCA CGA CTC AGT TAA CCA AGA GTT TTT CTT TTA CTA GAT
Ala Gln Lys Gly Ala Glu Ser Ile Gly Ser Gln Lys Glu Asn Asp Leu>

100 110 120 130 140
AAC CTT GAA GAC TCT AGT AAA AAA TCA CAT CAA AAC GCT AAA CAA GAC
TTG GAA CTT CTG AGA TCA TTT TTT AGT GTA GTT TTG CGA TTT GTT CTG
Asn Leu Glu Asp Ser Ser Lys Lys Ser His Gln Asn Ala Lys Gln Asp>

150 160 170 180 190
CTT CCT GCG GTG ACA GAA GAC TCA GTG TCT TTG TTT AAT GGT AAT AAA
GAA GGA CGC CAC TGT CTT CTG AGT CAC AGA AAC AAA TTA CCA TTA TTT
Leu Pro Ala Val Thr Glu Asp Ser Val Ser Leu Phe Asn Gly Asn Lys>

200 210 220 230 240
ATT TTT GTA AGC AAA GAA AAA AAT AGC TCC GGC AAA TAT GAT TTA AGN
TAA AAA CAT TCG TTT CTT TTT TTA TCG AGG CCG TTT ATA CTA AAT TCN
Ile Phe Val Ser Lys Glu Lys Asn Ser Ser Gly Lys Tyr Asp Leu Arg>

250 260 270 280
GCA ACA ATT GAT CAG GTT GAA CTT AAA GGA ACT TCC GAT AAA AAC AAN
CGT TGT TAA CTA GTC CAA CTT GAA TTT CCT TGA AGG CTA TTT TTG TTN
Ala Thr Ile Asp Gln Val Glu Leu Lys Gly Thr Ser Asp Lys Asn Asn>

290 300 310 320 330
GGT TCT GGA ACC CTT GAA GGT TCA AAG CCT GAC AAG AGT AAA GTA AAA
CCA AGA CCT TGG GAA CTT CCA AGT TTC GGA CTG TTC TCA TTT CAT TTT
Gly Ser Gly Thr Leu Glu Gly Ser Lys Pro Asp Lys Ser Lys Val Lys>

340 350 360 370 380
TTA ACA GTT TCT GCT GAT TTA AAC ACA GTA ACC TTA GAA GCA TTT GAT
AAT TGT CAA AGA CGA CTA AAT TTG TGT CAT TGG AAT CTT CGT AAA CTA
Leu Thr Val Ser Ala Asp Leu Asn Thr Val Thr Leu Glu Ala Phe Asp>

390 400 410 420 430
GCC AGC AAC CAA AAA ATT TCA AGT AAA GTT ACT AAA AAA CAG GGG TCA
CGG TCG TTG GTT TTT TAA AGT TCA TTT CAA TGA TTT TTT GTC CCC AGT
Ala Ser Asn Gln Lys Ile Ser Ser Lys Val Thr Lys Lys Gln Gly Ser>

440 450 460 470 480
ATA ACA GAG GAA ACT CTC AAA GCT AAT AAA TTA GAC TCA AAG AAA TTA
TAT TGT CTC CTT TGA GAG TTT CGA TTA TTT AAT CTG AGT TTC TTT AAT
Ile Thr Glu Glu Thr Leu Lys Ala Asn Lys Leu Asp Ser Lys Lys Leu>

490 500 510 520
ACA AGA TCA AAC GGA ACT ACA CTT GAA TAC TCA CAA ATA ACA GAT GCT
TGT TCT AGT TTG CCT TGA TGT GAA CTT ATG AGT GTT TAT TGT CTA CGA
Thr Arg Ser Asn Gly Thr Thr Leu Glu Tyr Ser Gln Ile Thr Asp Ala>

530 540 550 560 570
GAC AAT GCT ACA AAA GCA GTA GAA ACT CTA AAA AAT AGC ATT AAG CTT
CTG TTA CGA TGT TTT CGT CAT CTT TGA GAT TTT TTA TCG TAA TTC GAA
Asp Asn Ala Thr Lys Ala Val Glu Thr Leu Lys Asn Ser Ile Lys Leu>

580 590 600 610 620
GAA GGA AGT CTT GTA GTC GGA AAA ACA ACA GTG GAA ATT AAA GAA GGT
CTT CCT TCA GAA CAT CAG CCT TTT TGT TGT CAC CTT TAA TTT CTT CCA
Glu Gly Ser Leu Val Val Gly Lys Thr Thr Val Glu Ile Lys Glu Gly>

630 640 650 660 670
ACT GTT ACT CTA AAA AGA GAA ATT GAA AAA GAT GGA AAA GTA AAA GTC
TGA CAA TGA GAT TTT TCT CTT TAA CTT TTT CTA CCT TTT CAT TTT CAG
Thr Val Thr Leu Lys Arg Glu Ile Glu Lys Asp Gly Lys Val Lys Val>

680 690 700 710 720
TTT TTG AAT GAC ACT GCA GGT TCT AAC AAA AAA ACA GGT AAA TGG GAA
AAA AAC TTA CTG TGA CGT CCA AGA TTG TTT TTT TGT CCA TTT ACC CTT
Phe Leu Asn Asp Thr Ala Gly Ser Asn Lys Lys Thr Gly Lys Trp Glu>

730 740 750 760
GAC AGT ACT AGC ACT TTA ACA ATT AGT GCT GAC AGC AAA AAA ACT AAA
CTG TCA TGA TCG TGA AAT TGT TAA TCA CGA CTG TCG TTT TTT TGA TTT
Asp Ser Thr Ser Thr Leu Thr Ile Ser Ala Asp Ser Lys Lys Thr Lys>

770 780 790 800 810
GAT TTG GTG TTC TTA ACA GAT GGT ACA ATT ACA GTA CAA CAA TAC AAC
CTA AAC CAC AAG AAT TGT CTA CCA TGT TAA TGT CAT GTT GTT ATG TTG
Asp Leu Val Phe Leu Thr Asp Gly Thr Ile Thr Val Gln Gln Tyr Asn>
820 830 840 850 860
ACA GCT GGA ACC AGC CTA GAA GGA TCA GCA AGT GAA ATT AAA AAT CTT
TGT CGA CCT TGG TCG GAT CTT CCT AGT CGT TCA CTT TAA TTT TTA GAA
Thr Ala Gly Thr Ser Leu Glu Gly Ser Ala Ser Glu Ile Lys Asn Leu>

870 880 890
TCA GAG CTT AAA AAC GCT TTA AAA TAA
AGT CTC GAA TTT TTG CGA AAT TTT ATT
Ser Glu Leu Lys Asn Ala Leu Lys ***>
OspC-B31

Sequence Range: 1 to 633

10 20 30 40
ATG AAA AAG AAT ACA TTA AGT GCG ATA TTA ATG ACT TTA TTT TTA TTT
TAC TTT TTC TTA TGT AAT TCA CGC TAT AAT TAC TGA AAT AAA AAT AAA
Met Lys Lys Asn Thr Leu Ser Ala Ile Leu Met Thr Leu Phe Leu Phe>

50 60 70 80 90
ATA TCT TGT AAT AAT TCA GGG AAA GAT GGG AAT ACA TCT GCA AAT TCT
TAT AGA ACA TTA TTA AGT CCC TTT CTA CCC TTA TGT AGA CGT TTA AGA
Ile Ser Cys Asn Asn Ser Gly Lys Asp Gly Asn Thr Ser Ala Asn Ser>

100 110 120 130 140
GCT GAT GAG TCT GTT AAA GGG CCT AAT CTT ACA GAA ATA AGT AAA AAA
CGA CTA CTC AGA CAA TTT CCC GGA TTA GAA TGT CTT TAT TCA TTT TTT
Ala Asp Glu Ser Val Lys Gly Pro Asn Leu Thr Glu Ile Ser Lys Lys>

150 160 170 180 190
ATT ACG GAT TCT AAT GCG GTT TTA CTT GCT GTG AAA GAG GTT GAA GCG
TAA TGC CTA AGA TTA CGC CAA AAT GAA CGA CAC TTT CTC CAA CTT CGC
Ile Thr Asp Ser Asn Ala Val Leu Leu Ala Val Lys Glu Val Glu Ala>

200 210 220 230 240
TTG CTG TCA TCT ATA GAT GAA ATT GCT GCT AAA GCT ATT GGT AAA AAA
AAC GAC AGT AGA TAT CTA CTT TAA CGA CGA TTT CGA TAA CCA TTT TTT
Leu Leu Ser Ser Ile Asp Glu Ile Ala Ala Lys Ala Ile Gly Lys Lys>

250 260 270 280
ATA CAC CAA AAT AAT GGT TTG GAT ACC GAA TAT AAT CAC AAT GGA TCA
TAT GTG GTT TTA TTA CCA AAC CTA TGG CTT ATA TTA GTG TTA CCT AGT
Ile His Gln Asn Asn Gly Leu Asp Thr Glu Tyr Asn His Asn Gly Ser>

290 300 310 320 330
TTG TTA GCG GGA CGT TAT GCA ATA TCA ACC CTA ATA AAA CAA AAA TTA
AAC AAT CGC CCT GCA ATA CGT TAT AGT TGG GAT TAT TTT GTT TTT AAT
Leu Leu Ala Gly Arg Tyr Ala Ile Ser Thr Leu Ile Lys Gln Lys Leu>

340 350 360 370 380
GAT GGA TTG AAA AAT GAA GGA TTA AAG GAA AAA ATT GAT GCG GCT AAG
CTA CCT AAC TTT TTA CTT CCT AAT TTC CTT TTT TAA CTA CGC CGA TTC
Asp Gly Leu Lys Asn Glu Gly Leu Lys Glu Lys Ile Asp Ala Ala Lys>

390 400 410 420 430
AAA TGT TCT GAA ACA TTT ACT AAT AAA TTA AAA GAA AAA CAC ACA GAT
TTT ACA AGA CTT TGT AAA TGA TTA TTT AAT TTT CTT TTT GTG TGT CTA
Lys Cys Ser Glu Thr Phe Thr Asn Lys Leu Lys Glu Lys His Thr Asp>

440 450 460 470 480
CTT GGT AAA GAA GGT GTT ACT GAT GCT GAT GCA AAA GAA GCC ATT TTA
GAA CCA TTT CTT CCA CAA TGA CTA CGA CTA CGT TTT CTT CGG TAA AAT
Leu Gly Lys Glu Gly Val Thr Asp Ala Asp Ala Lys Glu Ala Ile Leu>

490 500 510 520
AAA ACA AAT GGT ACT AAA ACT AAA GGT GCT GAA GAA CTT GGA AAA TTA
TTT TGT TTA CCA TGA TTT TGA TTT CCA CGA CTT CTT GAA CCT TTT AAT
Lys Thr Asn Gly Thr Lys Thr Lys Gly Ala Glu Glu Leu Gly Lys Leu>

530 540 550 560 570
TTT GAA TCA GTA GAG GTC TTG TCA AAA GCA GCT AAA GAG ATG CTT GCT
AAA CTT AGT CAT CTC CAG AAC AGT TTT CGT CGA TTT CTC TAC GAA CGA
Phe Glu Ser Val Glu Val Leu Ser Lys Ala Ala Lys Glu Met Leu Ala>

580 590 600 610 620
AAT TCA GTT AAA GAG CTT ACA AGC CCT GTT GTG GCA GAA AGT CCA AAA
TTA AGT CAA TTT CTC GAA TGT TCG GGA CAA CAC CGT CTT TCA GGT TTT
Asn Ser Val Lys Glu Leu Thr Ser Pro Val Val Ala Glu Ser Pro Lys>

630
AAA CCT TAA
TTT GGA ATT
Lys Pro
OspC-K48

Sequence Range: 1 to 630

10 20 30 40
ATG AAA AAG AAT ACA TTA AGT GCG ATA TTA ATG ACT TTA TTT TTA TTT
TAC TTT TTC TTA TGT AAT TCA CGC TAT AAT TAC TGA AAT AAA AAT AAA
Met Lys Lys Asn Thr Leu Ser Ala Ile Leu Met Thr Leu Phe Leu Phe>

50 60 70 80 90
ATA TCT TGT AAT AAT TCA GGT GGG GAT ACC GCA TCT ACT AAT CCT GAT
TAT AGA ACA TTA TTA AGT CCA CCC CTA TGG CGT AGA TGA TTA GGA CTA
Ile Ser Cys Asn Asn Ser Gly Gly Asp Thr Ala Ser Thr Asn Pro Asp>

100 110 120 130 140
GAG TCT GCA AAA GGA CCT AAT CTT ACA GTA ATA AGC AAA AAA ATT ACA
CTC AGA CGT TTT CCT GGA TTA GAA TGT CAT TAT TCG TTT TTT TAA TGT
Glu Ser Ala Lys Gly Pro Asn Leu Thr Val Ile Ser Lys Lys Ile Thr>

150 160 170 180 190
GAT TCT AAT GCA TTT GTA CTG GCT GTG AAA GAA GTT GAG GCT TTG ATC
CTA AGA TTA CGT AAA CAT GAC CGA CAC TTT CTT CAA CTC CGA AAC TAG
Asp Ser Asn Ala Phe Val Leu Ala Val Lys Glu Val Glu Ala Leu Ile>

200 210 220 230 240
TCA TCT ATA GAT GAA CTT GCT AAT AAA GCT ATT GGT AAA GTA ATA CAT
AGT AGA TAT CTA CTT GAA CGA TTA TTT CGA TAA CCA TTT CAT TAT GTA
Ser Ser Ile Asp Glu Leu Ala Asn Lys Ala Ile Gly Lys Val Ile His>

250 260 270 280
CAA AAT AAT GGT TTA AAT GCT AAT GCG GGT CAA AAC GGA TCA TTG TTA
GTT TTA TTA CCA AAT TTA CGA TTA CGC CCA GTT TTG CCT AGT AAC AAT
Gln Asn Asn Gly Leu Asn Ala Asn Ala Gly Gln Asn Gly Ser Leu Leu>

290 300 310 320 330
GCA GGA GCC TAT GCA ATA TCA ACC CTA ATA ACA GAA AAA TTA AGT AAA
CGT CCT CGG ATA CGT TAT AGT TGG GAT TAT TGT CTT TTT AAT TCA TTT
Ala Gly Ala Tyr Ala Ile Ser Thr Leu Ile Thr Glu Lys Leu Ser Lys>

340 350 360 370 380
TTG AAA AAT TCA GAA GAG TTA AAT AAA AAA ATT GAA GAG GCT AAG AAC
AAC TTT TTA AGT CTT CTC AAT TTA TTT TTT TAA CTT CTC CGA TTC TTG
Leu Lys Asn Ser Glu Glu Leu Asn Lys Lys Ile Glu Glu Ala Lys Asn>
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SL3//33 





390 400 410 420

CAT TCT GAA GCA ...
GTA AGA CTT CGT ...
His Ser Glu Ala ...

440 450 460 470 480

GGA GTT GCT GCT ...
CCT CAA CGA CGA ...
Gly Val Ala Ala ...



490 500 510 520

TCA AAT CCT ACT AAA GAT AAG GGT GCT AAA GCA CTT AAA GAC TTA TCT
AGT TTA GGA TGA TTT CTA TTC CCA CGA TTT CGT GAA TTT CTG AAT AGA
Ser Asn Pro Thr Lys Asp Lys Gly Ala Lys Ala Leu Lys Asp Leu Ser>

530 540 550 560 570

GAA TCA GTA GAA AGC TTG GCA AAA GCA GCG CAA GAA GCA TTA GCT AAT
CTT AGT CAT CTT TCG AAC CGT TTT CGT CGC GTT CTT CGT AAT CGA TTA
Glu Ser Val Glu Ser Leu Ala Lys Ala Ala Gln Glu Ala Leu Ala Asn>

580 590 600 610 620

TCA GTT AAA GAA CTT ACA AAT CCT GTT GTG GCA GAA AGT CCA AAA AAA
AGT CAA TTT CTT GAA TGT TTA GGA CAA CAC CGT CTT TCA GGT TTT TTT
Ser Val Lys Glu Leu Thr Asn Pro Val Val Ala Glu Ser Pro Lys Lys>

630

CCT TAA
GGA ATT
Pro ***>
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OspC-PKO

Sequence Range: 1 to 639 



10 20 30 40

ATG AAA AAG AAT ACA TTA AGT GCG ATA TTA ATG ACT TTA TTT TTA TTT
TAC TTT TTC TTA TGT AAT TCA CGC TAT AAT TAC TGA AAT AAA AAT AAA
Met Lys Lys Asn Thr Leu Ser Ala Ile Leu Met Thr Leu Phe Leu Phe>

50 60 70 80 90

ATA TCT TGT AGT AAT TCA GGG AAA GGT GGG GAT TCT GCA TCT ACT AAT
TAT AGA ACA TCA TTA AGT CCC TTT CCA CCC CTA AGA CGT AGA TGA TTA
Ile Ser Cys Ser Asn Ser Gly Lys Gly Gly Asp Ser Ala Ser Thr Asn>

100 110 120 130 140

CCT GCT GAC GAG TCT GCG AAA GGG CCT AAT CTT ACA GAA ATA AGC AAA
GGA CGA CTG CTC AGA CGC TTT CCC GGA TTA GAA TGT CTT TAT TCG TTT
Pro Ala Asp Glu Ser Ala Lys Gly Pro Asn Leu Thr Glu Ile Ser Lys>

150 160 170 180 190

AAA ATT ACA GAT TCT AAT GCA TTT GTA CTT GCT GTT AAA GAA GTT GAG
TTT TAA TGT CTA AGA TTA CGT AAA CAT GAA CGA CAA TTT CTT CAA CTC
Lys Ile Thr Asp Ser Asn Ala Phe Val Leu Ala Val Lys Glu Val Glu>

200 210 220 230 240

ACT TTG GTT TTA TCT ATA GAT GAA CTT GCT AAG AAA GCT ATT GGT CAA
TGA AAC CAA AAT AGA TAT CTA CTT GAA CGA TTC TTT CGA TAA CCA GTT
Thr Leu Val Leu Ser Ile Asp Glu Leu Ala Lys Lys Ala Ile Gly Gln>

250 260 270 280

AAA ATA GAC AAT AAT AAT GGT TTA GCT GCT TTA AAT AAT CAG AAT GGA
TTT TAT CTG TTA TTA TTA CCA AAT CGA CGA AAT TTA TTA GTC TTA CCT
Lys Ile Asp Asn Asn Asn Gly Leu Ala Ala Leu Asn Asn Gln Asn Gly>

290 300 310 320 330

TCG TTG TTA GCA GGA GCC TAT GCA ATA TCA ACC CTA ATA ACA GAA AAA
AGC AAC AAT CGT CCT CGG ATA CGT TAT AGT TGG GAT TAT TGT CTT TTT
Ser Leu Leu Ala Gly Ala Tyr Ala Ile Ser Thr Leu Ile Thr Glu Lys>

340 350 360 370 380

TTG AGT AAA TTG AAA AAT TTA GAA GAA TTA AAG ACA GAA ATT GCA AAG
AAC TCA TTT AAC TTT TTA AAT CTT CTT AAT TTC TGT CTT TAA CGT TTC
Leu Ser Lys Leu Lys Asn Leu Glu Glu Leu Lys Thr Glu Ile Ala Lys>
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OspC-PKO 

390 400 410 420 430

GCT AAG AAA TGT TCC GAA GAA TTT ACT AAT AAA CTA AAA AGT GGT CAT
CGA TTC TTT ACA AGG CTT CTT AAA TGA TTA TTT GAT TTT TCA CCA GTA
Ala Lys Lys Cys Ser Glu Glu Phe Thr Asn Lys Leu Lys Ser Gly His>

450 460 470 480

GCA GAT CTT GGC AAA CAG GAT GCT ACC GAT GAT CAT GCA AAA GCA GCT
CGT CTA GAA CCG TTT GTC CTA CGA TGG CTA CTA GTA CGT TTT CGT CGA
Ala Asp Leu Gly Lys Gln Asp Ala Thr Asp Asp His Ala Lys Ala Ala>

490 500 510 520

ATT TTA AAA ACA CAT GCA ACT ACC GAT AAA GGT GCT AAA GAA TTT AAA
TAA AAT TTT TGT GTA CGT TGA TGG CTA TTT CCA CGA TTT CTT AAA TTT
Ile Leu Lys Thr His Ala Thr Thr Asp Lys Gly Ala Lys Glu Phe Lys>

530 540 550 560 570

GAT TTA TTT GAA TCA GTA GAA GGT TTG TTA AAA GCA GCT CAA GTA GCA
CTA AAT AAA CTT AGT CAT CTT CCA AAC AAT TTT CGT CGA GTT CAT CGT
Asp Leu Phe Glu Ser Val Glu Gly Leu Leu Lys Ala Ala Gln Val Ala>

580 590 600 610 620

CTA ACT AAT TCA GTT AAA GAA CTT ACA AGT CCT GTT GTA GCA GAA AGT
GAT TGA TTA AGT CAA TTT CTT GAA TGT TCA GGA CAA CAT CGT CTT TCA
Leu Thr Asn Ser Val Lys Glu Leu Thr Ser Pro Val Val Ala Glu Ser>

630

CCA AAA AAA CCT TAA
GGT TTT TTT GGA ATT
Pro Lys Lys Pro ***>
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Sequence Range: 1 to 624 





10 20 30 40

ATG AAA AAG AAT ACA TTA AGT GCG ATA TTA ATG ACT TTA TTT TTA TTT
TAC TTT TTC TTA TGT AAT TCA CGC TAT AAT TAC TGA AAT AAA AAT AAA
Met Lys Lys Asn Thr Leu Ser Ala Ile Leu Met Thr Leu Phe Leu Phe>

50 60 70 80 90

ATA TCT TGT AAT AAT TCA GGT GGG GAT NNN GCA TCT ACT AAT CCT GAT
TAT AGA ACA TTA TTA AGT CCA CCC CTA AGA CGT AGA TGA TTA GGA CTA
Ile Ser Cys Asn Asn Ser Gly Gly Asp Xaa Ala Ser Thr Asn Pro Asp>

100 110 120 130 140

GAG TCT GCA AAA GGA CCT AAT CTT ACC GTA ATA AGC AAA AAA ATT ACA
CTC AGA CGT TTT CCT GGA TTA GAA TGG CAT TAT TCG TTT TTT TAA TGT
Glu Ser Ala Lys Gly Pro Asn Leu Thr Val Ile Ser Lys Lys Ile Thr>

150 160 170 180 190

GAT TCT AAT GCA TTT TTA CTG GCT GTG AAA GAA GTT GAG GCT TTG CTT
CTA AGA TTA CGT AAA AAT GAC CGA CAC TTT CTT CAA CTC CGA AAC GAA
Asp Ser Asn Ala Phe Leu Leu Ala Val Lys Glu Val Glu Ala Leu Leu>

200 210 220 230 240

TCA TCT ATA GAT GAA CTT TCT AAA GCT ATT GGT AAA AAA ATA AAA AAT
AGT AGA TAT CTA CTT GAA AGA TTT CGA TAA CCA TTT TTT TAT TTT TTA
Ser Ser Ile Asp Glu Leu Ser Lys Ala Ile Gly Lys Lys Ile Lys Asn>

250 260 270 280

GAT GGT ACT TTA GAT AAC GAA GCA AAT CGA AAC GAA TCA TTG ATA GCA
CTA CCA TGA AAT CTA TTG CTT CGT TTA GCT TTG CTT AGT AAC TAT CGT
Asp Gly Thr Leu Asp Asn Glu Ala Asn Arg Asn Glu Ser Leu Ile Ala>

290 300 310 320 330

GGA GCT TAT GAA ATA TCA AAA CTA ATA ACA CAA AAA TTA AGT GTA TTG
CCT CGA ATA CTT TAT AGT TTT GAT TAT TGT GTT TTT AAT TCA CAT AAC
Gly Ala Tyr Glu Ile Ser Lys Leu Ile Thr Gln Lys Leu Ser Val Leu>

340 350 360 370 380

AAT TCA GAA GAA TTA AAG AAA AAA ATT AAA GAG GCT AAG GAT TGT TCC
TTA AGT CTT CTT AAT TTC TTT TTT TAA TTT CTC CGA TTC CTA ACA AGG
Asn Ser Glu Glu Leu Lys Lys Lys Ile Lys Glu Ala Lys Asp Cys Ser>
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OspC-TRO



390 400 410 420 430

GAA AAA TTT ACT ACT AAG CTA AAA GAT AGT CAT GCA GAG CTT GGT ATA
CTT TTT AAA TGA TGA TTC GAT TTT CTA TCA GTA CGT CTC GAA CCA TAT
Glu Lys Phe Thr Thr Lys Leu Lys Asp Ser His Ala Glu Leu Gly Ile>

440 450 460 470 480

CAA AGC GTT CAG GAT GAT AAT GCA AAA AAA GCT ATT TTA AAA ACA CAT
GTT TCG CAA GTC CTA CTA TTA CGT TTT TTT CGA TAA AAT TTT TGT GTA
Gln Ser Val Gln Asp Asp Asn Ala Lys Lys Ala Ile Leu Lys Thr His>

490 500 510 520

GGA ACT AAA GAC AAG GGT GCT AAA GAA CTT GAA GAG TTA TTT AAA TCA
CCT TGA TTT CTG TTC CCA CGA TTT CTT GAA CTT CTC AAT AAA TTT AGT
Gly Thr Lys Asp Lys Gly Ala Lys Glu Leu Glu Glu Leu Phe Lys Ser>

530 540 550 560 570

CTA GAA AGC TTG TCA AAA GCA GCG CAA GCA GCA TTA ACT AAT TCA GTT
GAT CTT TCG AAC AGT TTT CGT CGC GTT CGT CGT AAT TGA TTA AGT CAA
Leu Glu Ser Leu Ser Lys Ala Ala Gln Ala Ala Leu Thr Asn Ser Val>

580 590 600 610 620

AAA GAG CTT ACA AAT CCT GTT GTG GCA GAA AGT CCA AAA AAA CCT TAA
TTT CTC GAA TGT TTA GGA CAA CAC CGT CTT TCA GGT TTT TTT GGA ATT
Lys Glu Leu Thr Asn Pro Val Val Ala Glu Ser Pro Lys Lys Pro ***>
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P93 / 
Sequence Range: 1 to 2102 

10 20 30 40 

.«• « • • 

ATG AAA AAA ATG TTA CTA ATC TTT AGT TTT TTT CTT ATT TTC TTG AAT
TAC TTT TTT TAC AAT GAT TAG AAA TCA AAA AAA GAA TAA AAG AAC TTA
Met Lys Lys Met Leu Leu Ile Phe Ser Phe Phe Leu Ile Phe Leu Asn>

50 60 70 80 90

GGA TTT CCT GTT AGT GCA AGA GAA GTT GAT AGG GAA AAA TTA AAG GAC
CCT AAA GGA CAA TCA CGT TCT CTT CAA CTA TCC CTT TTT AAT TTC CTG
Gly Phe Pro Val Ser Ala Arg Glu Val Asp Arg Glu Lys Leu Lys Asp>

100 110 120 130 140

TTT GTT AAT ATG GAT CTT GAG TTT GTA AAT TAT AAA GGC CCT TAT GAT
AAA CAA TTA TAC CTA GAA CTC AAA CAT TTA ATA TTT CCG GGA ATA CTA
Phe Val Asn Met Asp Leu Glu Phe Val Asn Tyr Lys Gly Pro Tyr Asp>

150 160 170 180 190

TCT ACA AAT ACA TAT GAA CAA ATA GTG GGT ATT GGG GAG TTT TTA GCA
AGA TGT TTA TGT ATA CTT GTT TAT CAC CCA TAA CCC CTC AAA AAT CGT
Ser Thr Asn Thr Tyr Glu Gln Ile Val Gly Ile Gly Glu Phe Leu Ala>

200 210 220 230 240

AGA CCG TTG ACC AAT TCC AAT AGC AAC TCA AGT TAT TAT GGT AAA TAT
TCT GGC AAC TGG TTA AGG TTA TCG TTG AGT TCA ATA ATA CCA TTT ATA
Arg Pro Leu Thr Asn Ser Asn Ser Asn Ser Ser Tyr Tyr Gly Lys Tyr>

250 260 270 280

TTT ATT AAT AGA TTT ATT GAT GAT CAA GAT AAA AAA GCA AGC GTT GAT
AAA TAA TTA TCT AAA TAA CTA CTA GTT CTA TTT TTT CGT TCG CAA CTA
Phe Ile Asn Arg Phe Ile Asp Asp Gln Asp Lys Lys Ala Ser Val Asp>

290 300 310 320 330

GTT TTT TCT ATT GGT AGT AAG TCA GAG CTT GAC AGT ATA TTG AAT TTA
CAA AAA AGA TAA CCA TCA TTC AGT CTC GAA CTG TCA TAT AAC TTA AAT
Val Phe Ser Ile Gly Ser Lys Ser Glu Leu Asp Ser Ile Leu Asn Leu>

340 350 360 370 380

AGA AGA ATT CTT ACA GGG TAT TTA ATA AAG TCT TTC GAT TAT GAC AGG
TCT TCT TAA GAA TGT CCC ATA AAT TAT TTC AGA AAG CTA ATA CTG TCC
Arg Arg Ile Leu Thr Gly Tyr Leu Ile Lys Ser Phe Asp Tyr Asp Arg>
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390 400 410 420 430

TCT AGT GCA GAA TTA ATT GCT AAG GTT ATT ACA ATA TAT AAT GCT GTT
AGA TCA CGT CTT AAT TAA CGA TTC CAA TAA TGT TAT ATA TTA CGA CAA
Ser Ser Ala Glu Leu Ile Ala Lys Val Ile Thr Ile Tyr Asn Ala Val>

440 450 460 470 480

TAT AGA GGA GAT TTG GAT TAT TAT AAA GGG TTT TAT ATT GAG GCT GCT
ATA TCT CCT CTA AAC CTA ATA ATA TTT CCC AAA ATA TAA CTC CGA CGA
Tyr Arg Gly Asp Leu Asp Tyr Tyr Lys Gly Phe Tyr Ile Glu Ala Ala>

490 500 510 520

TTA AAG TCT TTA AGT AAA GAA AAT GCA GGT CTT TCT AGG GTT TAT AGT
AAT TTC AGA AAT TCA TTT CTT TTA CGT CCA GAA AGA TCC CAA ATA TCA
Leu Lys Ser Leu Ser Lys Glu Asn Ala Gly Leu Ser Arg Val Tyr Ser>

530 540 550 560 570

CAG TGG GCT GGA AAG ACA CAA ATA TTT ATT CCT CTT AAA AAG GAT ATT
GTC ACC CGA CCT TTC TGT GTT TAT AAA TAA GGA GAA TTT TTC CTA TAA
Gln Trp Ala Gly Lys Thr Gln Ile Phe Ile Pro Leu Lys Lys Asp Ile>

580 590 600 610 620

TTG TCT GGA AAT ATT GAG TCT GAC ATT GAT ATT GAC AGT TTA GTT ACA
AAC AGA CCT TTA TAA CTC AGA CTG TAA CTA TAA CTG TCA AAT CAA TGT
Leu Ser Gly Asn Ile Glu Ser Asp Ile Asp Ile Asp Ser Leu Val Thr>

630 640 650 660 670

GAT AAG GTG GTG GCA GCT CTT TTA AGT GAA AAT GAA GCA GGT GTT AAC
CTA TTC CAC CAC CGT CGA GAA AAT TCA CTT TTA CTT CGT CCA CAA TTG
Asp Lys Val Val Ala Ala Leu Leu Ser Glu Asn Glu Ala Gly Val Asn>

680 690 700 710 720

TTT GCA AGA GAT ATT ACA GAT ATT CAA GGC GAA ACT CAT AAG GCA GAT
AAA CGT TCT CTA TAA TGT CTA TAA GTT CCG CTT TGA GTA TTC CGT CTA
Phe Ala Arg Asp Ile Thr Asp Ile Gln Gly Glu Thr His Lys Ala Asp>

730 740 750 760

CAA GAT AAA ATT GAT ATT GAA TTA GAC AAT ATT CAT GAA AGT GAT TCC
GTT CTA TTT TAA CTA TAA CTT AAT CTG TTA TAA GTA CTT TCA CTA AGG
Gln Asp Lys Ile Asp Ile Glu Leu Asp Asn Ile His Glu Ser Asp Ser>

770 780 790 800 810

AAT ATA ACA GAA ACT ATT GAA AAT TTA AGG GAT CAG CTT GAA AAA GCT
TTA TAT TGT CTT TGA TAA CTT TTA AAT TCC CTA GTC GAA CTT TTT CGA
Asn Ile Thr Glu Thr Ile Glu Asn Leu Arg Asp Gln Leu Glu Lys Ala>
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820 830 840 850 860

ACA GAT GAA GAG CAT AAA AAA GAG ATT GAA AGT CAG GTT GAT GCT AAA
Thr Asp Glu Glu His Lys Lys Glu
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1 ATGAAAAAAT TGTTACTAAT CTTTAGTTTT TTTCTTATTT CTTTGAATGG ATTTCCTCTT 
61 AATTCAAGGG AAGTTGATAA GGAAAAATTA AAGGATTTTG TTAATATGGA TCTTGAGTTT 
121 GTAAACTATA AAGGTCCTTA TGATTCTACA AATACATATG AACAAATAGT AGGTATTGGT 
181 GAGTTTTTAG CAAGACCATT GATTAATTCC AATAGCAACT CAATTTATTA TGGTAAATAT 
241 TTTATTAATA GATTTATTGA TGATCAAGAT AAAAAAGCAA GCGTTGATGT TTTTTCTATT 
301 GGTAGTAGGT CACAGCTTGA CAGTATATTG AATCTAAGAA GAATTCTTAC AGGGTATTTG 
361 ATAAAGTCTT TTGATTATGA AAGATCTAGT GCTGAATTAA TTGCTAAGGT TATTACAATA 
421 CATAATGCTG TTTATAGAGG GGATTTAAAT TATTATAAAG AGGTTTATAT TGAGGCTGCT 
481 TTAAAGTCTT TAACTAAAGA AAATGCAGGT CTTTCTAGAG TGTACAGTCA ATGGGCTGGA 
541 AAGACACAAA TATTTATTCC TCTTAAAAAG AATATTTTAT CTGGAAAAGT TGAGTCTGAC
601 ATTGATATTG ACAGTTTGGT TACAGATAAG GTTGTGGCAG -GTGTTTTAAG CGAGAATGAA 
661 GCAGGTGTTA ACTTTGCAAG AGATATTACA GATATTCAAG GCGAAACTCA TAAAGCAGAT 
721 CAAGATAAAA TTGATATTGA ATTAGATAAT GTTCATAAAA GTGATTCCAA TATAACAGAG 
781 ACTATTGAGA ATTTAAGAGA TCAGCTTGAA AAGGCTACAG ATGAAGAGCA TAGAAAAGAG 
841 ATTGAAAGTC AGGTTGATGC TAAAAAGAAA CAAAAAGAAG AACTAGATAA AAAGGCAATC 
901 GATCTTGATA AAGCCCAACA AAAATTAGAT TCTTCTGAAG ATAATTTAGA TATTCAAAGG 
961 GATACTGTTA GAGAGAAGAT TCAAGAGGAT ATTGACGAGA TTAATAAAGA AAAGAATTTG 
1021 CCAAAACCTG GTGATGTAAG TTCTCCTAAA GTTGATAAGC AGCTACAAAT AAAAGAGAGT 
1081 CTAGAAGACT TGCAGGAACA GCTTAAAGAA ACTAGCGATG AAAATCAAAA AAGAGAAATT 
1141 GAAAAGCAAA TTGAAATCAA AAAAAGTGAT GAAGAACTTT TAAAAAGTAA AGATCCTAAA 
1201 GCATTAGATC TTAATGGAGA TTTAAATTCT AAAGTTTCTA GTAAAGAAAA AATTAAAGGC 
1261 AAAGAAGGAG AAATAGTCAA AGAGGAATCA AAGGCAAGTT TAGCTGATTT GAATAATGAC 
1321 GAAAATCTTA TGAGGCCGGA AGATCAAAAA TTATCTGAGG ATAAAAAATT AGATAGTAAA 
1381 AAAAATTTAA AACCTGTTTC TGAGATTGAG AGAGTAAATG AAATTTCGAA GTCTAACAAC 
1441 AATGAGATTA GTGAATCATC ACCATTATAT AAGCCTTCTT ATAGCGATAT GGATTCAAAA 
1501 GAGGGTATAG ATAATAAAGA TGTTAACTTG CAAGAAACCA AGTCTCAAAC TAAAAGTCAA 
1561 CCTACTTCTT TAAATCAAGA TTTGACTACT ATGTCTATAG ATTCTAGTAA TCGTGTATTT 
1621 TTAGAGGTTA TTGATCCTAT TACAAATTTA GGAACGCTTC AACTTATTGA TTTGAATACC 
1681 GGTGTTAGAC TTAAAGAAAG TACTCAGCAA GGCATTCAGC GGTATGGAAT TTATGAACGT 
1741 GAAAAAGATT TAGTTGTTAT TAAAATGGAT TCAGGAAAAG CCAAGCTTCA AATACTTAAT 
1801 AAACTTGAGA ATTTAAAAGT GATATCGGAG TCTAATTTTG AGATTAATAA AAATTCATCT
1861 CTTTATGTTG ACTCTAAAAT GATTTTAGTA GTTGTGAGAG ATAGTGGTAA TGTTTGGAGA
1921 TTGGCTAAAT TTTCTCCTAA AAATTTAAAT GAGTTTATTC TTTCAGAGAA TAAAATTTTG
1981 CCTTTTACTA GCTTTTCTGT GAGAAAGAAT TTTATTTATT TGCAGGATGA GTTTAAAAGT
2041 CTTATTACTT TAGATGTAAA TACTTTAAAA AAAGTTAAGT A 
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1 ATGAAAAAAA TGTTACTAAT CTTTAGTTTT TTTCTTGTTT TTTTAAATGG ATTTCCTCTT 
61 AATGCAAGGG AAGTTGATAA GGAAAAATTA AAGGACTTTG TTAATATGGA TCTTGAATTT 
121 GTTAATTACA AGGGTCCTTA TGATTCTACA GATACATATG AACAAATAGT AGGTATTGGG 
181 GAGTTTTTAG CAAGGCCGTT GAACAATTCC AATAGTAATT CAAGTTATTA TGGTAAATAT 
241 TTTGTTAATA GATTTATTGA CGATCAAGAT AAAAAAGCAA GTGTTGATAT TTTTTCTATT 
301 GGTAGTAAGT CAGAGCTTGA TAGTATATTA AATCTAAGAA GAATTCTTAC AGGGTATTTA 
361 ATGAAGTCTT TTGATTATGA GAGGTCTAGT GCGGAATTAA TTGCTAAAGC TATTACAATA 
421 TATAATGCTG TTTATAGAGG AGATTTAGAT TATTACAAAG AGTTTTATAT TGAGGCTTCT 
481 TTGAAGTCTT TGACTAAAGA AAATGCAGGT CTTTCTAGGG TGTACAGTCA ATGGGCTGGG 
541 AAGACACAAA TATTTATTCC TCTTAAAAAG AATATTTTAT CTGGAAATGT TGAGTGTGAC 
601 ATTGATATTG ATAGTTTGGT TACAGATAAG GTGGTGGCAG CTCTTTTAAG TGAGAATGAA 
661 TCAGGTGTTA ACTTTGCAAG AGATATTACA GACATTCAAG GCGAAACTCA TAAAGCAGAT 
721 CAAGATAAAA TTGATATTGA ATTAGATAAT TTTCATGAAA GTGATTCCAA TATAACAGAA 
781 ACTATTGAGA ATTTAAGGGA TCAGCTTGAA AAAGCTACAG ATGAAGAGCA TAAAAAAGAG 
841 ATTGAAAGTC AGGTTGATGC TAAAAAGAAA CAAAAGGAAG AATTAGATAA AAAGGCAATT 
901 GATCTTGATA AAGCTCAACA AAAATTAGAT TTTGCTGAAG ATAATCTAGA TATTCAAAGG 
961 GATACTGTTA GAGAGAAGCT TCAAGAAAAT ATTAACGAGA CTAATAAGGA AAAGAATTTA
1021 CCAAAGCCTG GTGATGTAAG TTCTCCTAAG GTTGATAAGC AGTTGCAGAT AAAAGAGAGT 
1081 CTAGAAGATT TGCAAGAGCA GCTTAAAGAA GCTAGTGATG AAAATCAAAA AAGAGAAATA 
1141 GAAAAGCAAA TTGAAATCAA AAAAAATGAT GAAGAACTTT TTAAAAATAA AGATCATAAA 
1201 GCATTAGATC TTAAGCAAGA ATTAAATTCT AAAGCTTCTA GTAAAGAAAA AATTGAAGGC 
1261 GAAGAAGAGG ATAAAGAATT AGATAGTAAA AAAAATTTAG AGCCTGTTTC TGAGGCTGAT 
1321 AAAGTAGATA AAATTTCCAA GTCTAACAAC AATGAGGTTA GTAAATTATC CCCGTTAGAT 
1381 GAGCCTTCTT ATAGCGACAT TGATTCGAAA GAGGGTGTAG ATAACAAAGA TGTTGATTTG 
1441 CAAAAAACTA AACCCCAAGT TGAAAGTCAA CCTACTTCGT TAAATGAAGA TTTGATTGAT 
1501 GTGTCTATAG ATTCCAGTAA TCCTGTCTTT TTAGAGGTTA TCGATCCGAT TACAAATTTA
1561 GGAACGCTTC AACTTATTGA TTTGAATACC GGTGTTAGAC TTAAAGAAAG TGCTCAACAA 
1621 GGTATTCAGC GATATGGAAT TTATGAACGT GAAAAAGATT TGGTTGTTAT TAAAATAGAT 
1681 TCAGGAAAAG CTAAGCTTCA GATACTTGAT AAACTCGAGA ATTTAAAAGT GATATCAGAG 
1741 TCTAATTTTG AGATTAATAA AAATTCATCT CTTTATGTTG ACTCTAGAAT GATTTTAGTA
1801 GTTGTTAAGG ACGATAGTAA TGCTTGGAGA TTGGCTAAAT TTTCTCCTAA AAATTTAGAT 
1861 GAATTTATTC TGTCAGAAAA TAAAATTTTG CCTTTTACTA GCTTTGCTGT GAGAAAGAAT 
1921 TTTATTTATT TGCAAGATGA ACTTAAAAGC TTAGTTACTT TAGATGTAAA TACTTTAAAA 
1981 AAAGTTAAGT A 
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1 ATGAAAAAAA TGTTACTAAT CTTTAGTTTT TTTCTTATTT CTTTGAATGG ATTTCCCCTT
61 AATGCAAGGG AAGTTGATAA GGAAAAATTA AAGGACTTTG TTAATATGGA TCTTGAGTTT
121 GTAAACTATA AAGGTCCTTA TGATTCTACA AATACATATG AACAAATAGT AGGTATTGGT
181 GAGTTTTTAG CAAGACCATT GATTAATTTC AATAGCAACT CAAGTTATTA TGGTAAATAT
241 TTTATTAATA GATTTATTGA CGATCAAGAT AAAAAAGCAA GCGTTGATGT TTTTTCTATT
301 AGTAGTAAGT CACAGCTTGA CAGTATATTG AATTTAAGAA GAATTCTTAC AGGGTATTTG
361 ATAAAGTCTT TTGATTATGA AAGATCTAGT GCTGAATTAA TTGCCAAGGT TATTACAATA
421 CATAATGCTG TTTATAGAGG TGATTTAAAT TATTATAAAG AGTTTTATAT TGAGTCTGCT
481 TTAAAGTCTT TAACTAAAGA AAATGCAGGT CTTTCTAGAG TGTACAGTCA ATGGGCTGGA
541 AAGACACAAA TATTTATTCC TCTTAAAAAG AATATTTTAT CTGGAAAAAT TGAGTCTGAC
601 ATTGATATTG ATAGTTTGGT TACAGATAAG GTTGTGGGAG -GTGTTTTAAG CGAAAATGAA
661 GCAGGTGTTA ACTTTGCAAG GGATATTACA GATATTCAAG GAGAAACTCA TAAAGCAGAT
721 CAAGATAAAA TTGATATTGA ATTAGATAAT GTTCATGAAA GTGATTCCAA TATAACAGAA
781 ACTATTGAGA ATTTAAGAGA TCAGCTTGAA AAGGCTACAG ATGAAGAGCA TAGAAAAGAG
841 ATTGAAAGTC AAGTTGATGC TAAAAAGAAA CAAAAAGAAG AACTAGATAA AAAGGCAATC
901 GATCTTGATA AAGCCCAACA AAAATTAGAT TTTTCTGAAG ATAATTTAGA TATTCAAAGG
961 GATACTGTTA GAGAGAAGAT TCAAGAGGAT ATTAACGAGA TTAATAAGGA AAAGAATTTA
1021 CCAAAACCTG GTGATGTAAG TTCTCCTAAA GTTGATAAGC AGCTACAAAT AAAAGAGAGT
1081 CTAGAAGACT TGCAGGAGCA GCTTAAAGAA ACTAGCGATG AAAATCAAAA AAGAGAAATT
1141 GAAAAGCAAA TTGAAATCAA AAAAAGTGAT GAAGAACTTT TAAAAAGCAA AGATCCTAAA
1201 GCATTAGATC TTAATCGAGA TTTAAATTCT AAAGCTTCTA GTAAAGAAAA AATTAAAGGC
1261 AAAGAAAAAG AAATAGTCAA AGAGAAATCA AAGGTAAGTT TAGGTGATTT GGATAATGAC
1321 GAAACCCTTA TGACGCCGGA AGATCAAAAA TTATCTGAGG ATAAAAAATT AGATAGTAAA
1381 AAAAATTTAA AACCTGTTTC TGAGATTGAG AGAGTAAATG AAATTTCAAA GTCTAACAAC
1441 AATGAGGTTA GCAAATCATC ACCATTAGAT AAGCCTTCTT ATAGTGATAT CGATTCAAAA
1501 GAGGTTGTAG ATAATAAAGA TGTTAATTTG CAAGAAACCA AGCCTCAAGC TAAAAGTCAA
1561 TCTACTTCTT TAAATCAAGA TTTGATTACT ATGTCTATAG ATTCTAGTAA TCCTGTATTT
1621 TTAGAGGTTA TTGATCCTAT TACAAATTTA GGAATGCTTC AACTTATTGA TTTAAATACT
1681 GGTGTTAGAC TTAAAGAAAG CACTCAGCAA GGCATTCAGC GTTATGGAAT TTATGAACGT
1741 GAAAAAGATT TAGTTGTTAT TAAAATGGAT TCAGGAAAAG CTAAGCTTCA AATACTTAAT
1801 AAACTTGAGA ATTTAAAAGT GATATCAGAG TCTAATTTTG AGATTAATAA AAATTCATCT
1861 CTTTATGTTG ACTCTAAAAT GATTTTAGTA GCTGTGAAAG ATAGTGGTAA TGTTTGGAGA
1921 TTGGCTAAAT TTTCTCCTAA AAATTTAGAT GAGTTTATTC TTTCAGAGAA TAAAATTTTG
1981 CCTTTTACTA GCTTTTCTGT GAGAAAGAAT TTTATTTATT TGCAAGATGA GTTTAAAAGT
2041 CTTATTACTT TAGATGTAAA TACTTTAAAA AAAGTTAAGT A
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1 ATGAAAAAAA TGTTACTAAT CTTTAGTTTT TTTCTTGTTT TTTTAAATGG ATTTCCTCTT
61 AATGCAAGGG AAGTTGATAA GGAAAAATTA AAGGACTTTG TTAATATGGA TCTTGAATTT
121 GTTAATTACA AGGGTCCTTA TGATTCTACA AATACATATG AACAAATAGT AGGTATTGGG
181 GAGTTTTTAG CAAGGCCGTT GATCAATTCC AATAGTAATT CAAGTTATTA TGGTAAATAT
241 TTTGTTAATA GATTTATTGA CGATCAAGAT AAAAAAGCAA GTGTTGATAT TTTTTCTATT
301 GGTAGTAAGT CAGAGCTTGA TAGTATATTA AATCTAAGAA GAATTCTTAC AGGGTATTTA
361 ATGAAGTCTT TTGATTATGA GAGGTCTAGT GCGGAATTAA TTGCTAAAGC TATTACAATA
421 TATAATGCTG TTTATAGAGG AGATTTAGAT TATTACAAAG AGTTTTATAT TGAGGCTTCT
481 TTGAAGTCTT TGACTAAAGA AAATGCAGGT CTTTCTAGGG TGTACAGTCA ATGGGCTGGG
541 AAGACACAAA TATTTATTCC TCTTAAAAAG AATATTTTAT CTGGAAATGT TGAGTCTGAC
601 ATTGATATTG ATAGTTTGGT TACAGATAAG GTGGTGGCAG CTCTTTTAAG TGAGAATGAA
661 TCAGGTGTTA ACTTTGCAAG AGATATTACA GACATTCAAG GCGAAACTCA TAAAGCAGAT
721 CAAGATAAAA TTGATATTGA ATTAGATAAT ATTCATGAAA GTGATTCCAA TATAACAGAA
781 ACTATTGAGA ATTTAAGGGA TCAGCTTGAA AAAGCTACAG ATGAAGAGCA TAAAAAAGAG
841 ATTGAAAGTC AGGTTGATGC TAAAAAGAAA CAAAAGGAAG AATTAGATAA AAAGGCAATT
901 GATCTTGATA AAGCTCAACA AAAATTAGAT TTTGCTGAAG ATAATCTAGA TATTCAAAGG
961 GATACTGTTA GAGAGAAGCT TCAAGAGAAT ATTAACGAGA CTAATAAGGA AAAGAATTTA
1021 CCAAAGCCTG GTGATGTAAG TTCTCCTAAA GTTGATAAGC AACTACAAAT AAAAGAGAGC
1081 CTGGAAGATT TGCAGGAGCA GCTTAAAGAA ACTGGTGATG AAAATCAGAA AAGAGAAATT
1141 GAAAAGCAAA TTGAAATCAA AAAAAGTGAT GAAAAGCTTT TAAAAAGTAA AGATGATAAA
1201 GCAAGTAAAG ATGGTAAAGC CTTGGATCTT GATCGAGAAT TAAATTCTAA AGCTTCTAGC
1261 AAAGAAAAAA GTAAAGCCAA GGAAGAAGAA ATAACCAAGG GTAAGTCACA GAAAAGCTTA
1321 GGCGATTTGA ATAATGATGA AAATCTTATG ATGCCAGAAG ATCAAAAATT ACCTGAGGTT
1381 AAAAAATTAG ATAGCAAAAA AGAATTTAAA CCTGTTTCTG AGGTTGAGAA ATTAGATAAG
1441 ATTTTCAAGT CTAATAACAA TGTTGGAGAA TTATCACCGT TAGATAAATC TTCTTATAAA
1501 GACATTGATT CAAAAGAGGA GACAGTTAAT AAAGATGTTA ATTTGCAAAA GACTAAGCCT
1561 CAGGTTAAAG ACCAAGTTAC TTCTTTGAAT GAAGATTTGA CTACTATGTC TATAGATTCC
1621 AGTAGTCCTG TATTTTTAGA GGTTATTGAT CCAATTACAA ATTTAGGAAC TCTTCAACTT
1681 ATTGATTTAA ATACTGGTGT TAGGCTTAAA GAAAGCACTC AGCAAGGCAT TCAGCGGTAT
1741 GGAATTTATG AACGTGAAAA AGATTTGGTT GTTATTAAAA TGGATTCAGG AAAAGCTAAG
1801 CTTCAGATAC TTGATAAACT TGAAAATTTA AAAGTGGTAT CAGAGTCTAA TTTTGAGATT
1861 AATAAAAATT CATCTCTTTA TGTTGATTCT AAAATGATTT TAGTAGCTGT TAGGGATAAA
1921 GATAGTAGTA ATGATTGGAG ATTGGCCAAA TTTTGTCCTA AAAATTTAGA TGAGTTTATT
1981 CTTTCAGAGA ATAAAATTAT GCCTTTTACT AGCTTTTCTG TGAGAAAAAA TTTTATTTAT
2041 TTGCAAGATG AGTTTAAAAG TCTAGTTATT TTAGATGTAA ATACTTTAAA AAAAGTTAAG
2101 TAAAGCC
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1 ATGAAAAAAA TGTTACTAAT CTTTAGTTTT TTTCTTGTTT TTTTAAATGG ATTTCCTCTT
61 AATGCAAGGG AAGTTGATAA GGAAAAATTA AAGGACTTTG TTAATATGGA TCTTGAATTT
121 GTTAATTACA AGGGTCCTTA TGATTCTACA GATACATATG AACAAATAGT AGGTATTGGG
181 GAGTTTTTAG CAAGGCCGTT GAACAATTCC AATAGTAATT CAAGTTATTA TGGTAAATAT
241 TTTGTTAATA GATTTATTGA CGATCAAGAT AAAAAAGCAA GTGTTGATAT TTTTTCTATT
301 GGTAGTAAGT CAGAGCTTGA TAGTATATTA AATCTAAGAA GAATTCTTAC AGGGTATTTA
361 ATGAAGTCTT TTGATTATGA GAGGTCTAGT GCGGAATTAA TTGCTAAAGC TATTACAATA
421 TATAATGCTG TTTATAGAGG AGATTTAGAT TATTACAAAG AGTTTTATAT TGAGGCTTCT
481 TTGAAGTCTT TGACTAAAGA AAATGCAGGT CTTTCTAGGG TGTACAGTCA ATGGGCTGGG
541 AAGACACAAA TATTTATTCC TCTTAAAAAG AATATTTTAT CTGGAAATGT TGAGTGTGAC
601 ATTGATATTG ATAGTTTGGT TACAGATAAG GTGGTGGCAG CTCTTTTAAG TGAGAATGAA
661 TCAGGTGTTA ACTTTGCAAG AGATATTACA GACATTCAAG GCGAAACTCA TAAAGCAGAT
721 CAAGATAAAA TTGATATTGA ATTAGATAAT TTTCATGAAA GTGATTCCAA TATAACAGAA
781 ACTATTGAGA ATTTAAGGGA TCAGCTTGAA AAAGCTACAG ATGAAGAGCA TAAAAAAGAG
841 ATTGAAAGTC AGGTTGATGC TAAAAAGAAA CAAAAGGAAG AATTAGATAA AAAGGCAATT
901 GATCTTGATA AAGCTCAACA AAAATTAGAT TTTGCTGAAG ATAATCTAGA TATTCAAAGG
961 GATACTGTTA GAGAGAAGCT TCAAGAAAAT ATTAACGAGA CTAATAAGGA AAAGAATTTA
1021 CCAAAGCCTG GTGATGTAAG TTCTCCTAAG GTTGATAAGC AGTTGCAGAT AAAAGAGAGT
1081 CTAGAAGATT TGCAAGAGCA GCTTAAAGAA GCTAGTGATG AAAATCAAAA AAGAGAAATA
1141 GAAAAGCAAA TTGAAATCAA AAAAAATGAT GAAGAACTTT TTAAAAATAA AGATCATAAA
1201 GCATTAGATC TTAAGCAAGA ATTAAATTCT AAAGCTTCTA GTAAAGAAAA AATTGAAGGC
1261 GAAGAAGAGG ATAAAGAATT AGATAGTAAA AAAAATTTAG AGCCTGTTTC TGAGGCTGAT
1321 AAAGTAGATA AAATTTCCAA GTCTAACAAC AATGAGGTTA GTAAATTATC CCCGTTAGAT
1381 GAGCCTTCTT ATAGCGACAT TGATTCGAAA GAGGGTGTAG ATAACAAAGA TGTTGATTTG
1441 CAAAAAACTA AACCCCAAGT TGAAAGTCAA CCTACTTCGT TAAATGAAGA TTTGATTGAT
1501 GTGTCTATAG ATTCCAGTAA TCCTGTCTTT TTAGAGGTTA TCGATCCGAT TACAAATTTA
1561 GGAACGCTTC AACTTATTGA TTTGAATACC GGTGTTAGAC TTAAAGAAAG TGCTCAACAA
1621 GGTATTCAGC GATATGGAAT TTATGAACGT GAAAAAGATT TGGTTGTTAT TAAAATAGAT
1681 TCAGGAAAAG CTAAGCTTCA GATACTTGAT AAACTCGAGA ATTTAAAAGT GATATCAGAG
1741 TCTAATTTTG AGATTAATAA AAATTCATCT CTTTATGTTG ACTCTAGAAT GATTTTAGTA
1801 GTTGTTAAGG ACGATAGTAA TGCTTGGAGA TTGGCTAAAT TTTCTCCTAA AAATTTAGAT
1861 GAATTTATTC TGTCAGAAAA TAAAATTTTG CCTTTTACTA GCTTTGCTGT GAGAAAGAAT
1921 TTTATTTATT TGCAAGATGA ACTTAAAAGC TTAGTTACTT TAGATGTAAA TACTTTAAAA
1981 AAAGTTAAGT A
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1 ATGAAAAAAA TGTTACTAAT CTTTAGTTTT TTTCTTATTT TTTTGAATCG ATTTCCTCTT 
61 AATGCAAGGA AAGTTGATAA GGAAAAATTA AAGGATTTTG TTAATATGGA TCTTGAGTTT 
121 GTAAATTATA AAGGTCCTTA TGATTCTACA AATACGTATG AACAAATAGT GGGTATTGGG 
181 GAGTTTTTAG CAAGACCGCT GACCAATTCC AATAGCAACT CAAGTTATTA TGGCAAATAT 
241 TTTATTAATA GATTTATTGA TGATCAAGAT AAAAAAGCAA GTGTTGATGT TTTTTCTATA 
301 AGCAGCAAAT CAGAGCTTGA CAGTATATTG AATTTAAGAA GAATTCTTAC AGGGTATATA 
361 ATAAAGTCTT TCGATTATGA CAGGTCTAGT GCAGAATTAA TTGCTAAGGT TATTACAATA 
421 TATAATGCTG TTTATAGAGG AGATTTGGAT TATTATAAAG GGTTTTATAT TGAGCCTGCT 
481 TTGAAGTCTT TAACTAAAGA AAACGCAGGT CTTTCTAGGG TTTACAGTCA GTGGGCTGGA 
541 AAGACTCAAA TATTTATTCC TCTTAAAAAG GATATTTTGT CTGGAAATAT TGAATCTGAC 
601 ATTGATATTG ACAGTTTGGT TACAGATAAG GTGATAGCAG CTCTTTTAAG CGAAAATGAA 
661 GCAGGCGTTA ACTTTGCAAG AGATATTACA GATATTCAAG GCGAAACTCA TAAGGCAGAT 
721 CAAGATAAGA TTGATACTGA ATTAGACAAT ATCCATGAAA GCGATTCTAA TATAACAGAA 
781 ACTATTGAAA ATTTAAGGGA TCAGCTTGAA AAAGCTACAG ATGAAGAGCA TAAAAAAGAG
841 ATTGAAAGTC AGGTTGATGC TAAAAAGAAA GAAAAGGAAG AGCTAGATAA AAAGGCAATC
901 AATCTTGATA AAGCTCAGCA AAAATTAGAC TCTGCTGAAG ATAATTTAGA TGTTCAAAGA 
961 GATACTGTTA GAGAGAAAAT TCAAGAGGAT ATTAATGAGA TTAATAAGGA AAAGAATTTG 
1021 CCAAAACCTG GTGATGTAAG TTCTCCTAAA GTTGATAAGC AACTNCAAAT AAAAGAGAGT
1081 CTAGAAGATT TGCAGGAGCA GCTTAAAGAA GCTGGTGATG AAAATCAGAA AAGAGAAATT 
1141 GAGAAGCAAA TTGAAATCAA AAAAAGGGAC GAAGAACTTT TAAAAAGTAA AGATGGCAAA 
1201 GTAAGTAAAG ATTATGAAGC ATTAGATCTT GATCGAGAAT TATCCAAAGC TTCTAGTAAA 
1261 GAAAAAAGTA AGGTCAAGGA AGAAGAAATA ACTAAAGGTA AATCACGGGC AAGCTTAGGC 
1321 GATTTGAATA ATGATAAAAA CCTTATGTTG CCAGAAGATC AAAAATTACC TGAAGATAAA 
1381 AAATTGGATA GTAAATTAGA TGGTAAAAAA GAATTTAAAC CAGTTTCTGA GGTTGAAAAA 
1441 TTAGATAAGA TTTCCAAGTC TAATAACAAT GAGGTTGGCA AGTTATCACC ATTAGATAAG 
1501 CCTTCTTATG ATGATATTGA TTCAAAAGAG GAGGTAGATA ATAAAGCTAT TAATTTGCAA 
1561 AAGATCGACC CTAAAGTTAA AGACCAAACT ACTTCTTTGA ATGAAGATTT GGATAAAGAT 
1621 TTGACTACTA TGTCTATAGA TTCCAGCAGT CCTGTATTTC TAGAGGTTAT TGATCCTATT 
1681 ACAAATTTAG GAACCCTGCA GCTTATTGAT TTAAATACTG GGGTTAGGCT TAAGGAAAGC 
1741 ACTCAGCAAG GCATTCAGCG GTATGGAATT TATGAACGTG AAAAAGATTT GGTTGTTATT 
1801 AAAATGGATT CAGGAAAGGC TAAGCTTCAA ATACTTAATA AGCTTGAAAA TTTGAAAGTG 
1861 GTATCAGAGT CTAATTTTGA GATCAATAAA AATTCATCTC TTTATGTTGA CTCTAAAATG 
1921 ATTTTGGCAG CTGTTAGAGA TAAGGATGAT AGCAATGCTT GGAGATTGGC TAAATTTTCT 
1981 CCTAAAAATT TGGATGAGTT TATTCTTTCA GAGAATAAAA TTTTGCCTTT TACTAGCTTT 
2041 TCTGTGAGAA AAAATTTTAT TTATTTGCAA GATGAGCTTA AAAATCTAGT TATTTTAGAT 
2101 GTAAATACTT TAAAAAAAGT TAAGTA 
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10 20 30 40 

« * * * * * « * « 

ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCC TTA ATA GCA 
TAC TTT TTT ATA AAT AAC CCT TAT CCA GAT TAT AAT CGG AAT TAT CGT 
Met Lys Lys Tyr Leu Leu Gly Ile Gly Leu Ile Leu Ala Leu Ile Ala>

50 60 70 80 90 

* « « * •* * * « m 

TGT AAG CAA AAT GTT AGC AGC CTT GAT GAA AAA AAT AGC GTT TCA GTA
ACA TTC GTT TTA CAA TCG TCG GAA CTA CTT TTT TTA TCG CAA AGT CAT
Cys Lys Gln Asn Val Ser Ser Leu Asp Glu Lys Asn Ser Val Ser Val>

100 110 120 130 140 

« « « • * • » * * 

GAT TTA CCT GGT GGA ATG ACA GTT CTT GTA AGT AAA GAA AAA GAC AAA 
CTA AAT GGA CCA CCT TAC TGT CAA GAA CAT TCA TTT CTT TTT CTG TTT 
Asp Leu Pro Gly Gly Met Thr Val Leu Val Ser Lys Glu Lys Asp Lys> 

150 160 170 180 190

« * «** * * * ♦ « 

GAC GGT AAA TAC AGT CTA GAG GCA ACA GTA GAC AAG CTT GAG CTT AAA 
CTG CCA TTT ATG TCA GAT CTC CGT TGT CAT CTG TTC GAA CTC GAA TTT 
Asp Gly Lys Tyr Ser Leu Glu Ala Thr Val Asp Lys Leu Glu Leu Lys> 

200 210 220 230 240 

« « • * • * • * * ♦ 

GGA ACT TCT GAT AAA AAC AAC GGT TCT GGA ACA CTT GAA GGT GAA AAA 
CCT TGA AGA CTA TTT TTG TTG CCA AGA CCT TGT GAA CTT CCA CTT TTT 
Gly Thr Ser Asp Lys Asn Asn Gly Ser Gly Thr Leu Glu Gly Glu Lys> 

250 260 270 280

« « * • «« 

ACT GAC AAA AGT AAA GTA AAA TTA ACA ATT GCT GAT GAC CTA AGT CAA 
TGA CTG TTT TCA TTT CAT TTT AAT TGT TAA CGA CTA CTG GAT TCA GTT 
Thr Asp Lys Ser Lys Val Lys Leu Thr Ile Ala Asp Asp Leu Ser Gln>

290 300 310 320 330 

* «• * ** • •* • 

ACT AAA TTT GAA ATT TTC AAA GAA GAT GCC AAA ACA TTA GTA TCA AAA 
TGA TTT AAA CTT TAA AAG TTT CTT CTA CGG TTT TGT AAT CAT AGT TTT 
Thr Lys Phe Glu Ile Phe Lys Glu Asp Ala Lys Thr Leu Val Ser Lys>

340 350 360 370 380

* * * • * * * * * 

AAA GTA ACC CTT AAA GAC AAG TCA TCA ACA GAA GAA AAA TTC AAC GAA 
TTT CAT TGG GAA TTT CTG TTC AGT AGT TGT CTT CTT TTT AAG TTG CTT 
Lys Val Thr Leu Lys Asp Lys Ser Ser Thr Glu Glu Lys Phe Asn Glu> 
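
Each block in these listings stacks three rows: the coding strand written in codons, its base-by-base complement, and the translation. The following is a minimal sketch of how the second and third rows follow from the first, assuming the standard genetic code. The helper names `complement_row` and `translate_row` are hypothetical and do not appear in the source.

```python
# Standard genetic code, indexed by codon in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# One-letter to three-letter residue names, as used in the translation rows.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement_row(sense_row: str) -> str:
    """Return the base-by-base complement (not reversed), keeping codon spacing."""
    return sense_row.translate(COMPLEMENT)


def translate_row(sense_row: str) -> str:
    """Translate space-separated codons into three-letter residue names."""
    return " ".join(ONE_TO_THREE[CODON_TO_AA[codon]] for codon in sense_row.split())


if __name__ == "__main__":
    sense = "AAA GTA ACC CTT AAA GAC AAG TCA TCA ACA GAA GAA AAA TTC AAC GAA"
    print(sense)
    print(complement_row(sense))
    print(translate_row(sense))
```

Regenerating the lower two rows from the coding strand in this way is one option for spotting characters that were misread during text extraction.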
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Thr He Ser Val Asn Ser Lys Lys Thr Thr Gin Leu Val Phe Thr Lys> 



730 740 
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CAA TAC ACA ATA ACT GTA AAA CAA 
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Gin Tyr Thr lie Thr Val Lys Gin 



750 760 
« • # •* • 

TAC GAC TCC GCA GGT ACC AAT TTA 
ATG CTG AGG CGT CCA TGG TTA AAT 
Tyr Asp Ser Ala Gly Thr Asn Leu> 
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CTA CTG TTT TCA TTT CGT TTT AAT TGT TAA CGA CTG CTA GAT TCA TTT 
Asp Asp Lys Ser Lys Ala Lys Leu Thr Ile Ala Asp Asp Leu Ser Lys>

290 300 310 320 330 

**• ******* 

ACC ACA TTC GAA CTT TTA AAA GAA GAT GGC AAA ACA TTA GTG TCA AGA 
TGG TGT AAG CTT GAA AAT TTT CTT CTA CCG TTT TGT AAT CAC AGT TCT 
Thr Thr Phe Glu Leu Leu Lys Glu Asp Gly Lys Thr Leu Val Ser Arg 

340 350 360 370 380 

AAA GTA AGT TCT AGA GAC AAA ACA TCA ACA GAT GAA ATG TTC AAT GAA 
TTT CAT TCA AGA TCT CTG TTT TGT AGT TGT CTA CTT TAC AAG TTA CTT 
Lys Val Ser Ser Arg Asp Lys Thr Ser Thr Asp Glu Met Phe Asn Glu 
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3 90 400 410 420 42C 

AAA GGT GAA TTG TCT GCA AAA ACC ATG ACA AGA GAA AAT GGA ACC AAA 
TTT CCA CTT AAC AGA CGT TTT TGG TAG TGT TCT CTT TTA CCT TGG TTT 
Lys Gly Glu Leu Ser Ala Lys Thr Met Thr Arg Glu Asn Gly Thr Lys> 

440 450 460 470 480 

• * * * * * * * * 

CTT GAA TAT ACA GAA ATG AAA AGC GAT GGA ACC GGA AAA GCT AAA GAA 
GAA CTT ATA TGT CTT TAC TTT TCG CTA CCT TGG CCT TTT CGA TTT CTT
Leu Glu Tyr Thr Glu Met Lys Ser Asp Gly Thr Gly Lys Ala Lys Glu> 

490 500 510 520 

******* 

GTT TTA AAA AAG TTT ACT CTT GAA GGA AAA GTA GCT AAT GAT AAA GTA 
CAA AAT TTT TTC AAA TGA GAA CTT CCT TTT CAT CGA TTA CTA TTT CAT 
Val Leu Lys Lys Phe Thr Leu Glu Gly Lys Val Ala Asn Asp Lys Val> 

530 540 550 560 570

* * * 

ACA TTG GAA GTA AAA GAA GGA ACC GTT ACT TTA AGT AAG GAA ATT GCA 
TGT AAC CTT CAT TTT CTT CCT TGG CAA TGA AAT TCA TTC CTT TAA CGT 
Thr Leu Glu Val Lys Glu Gly Thr Val Thr Leu Ser Lys Glu Ile Ala>

580 590 600 610 620

* 

AAA TCT GGA GAA GTA ACA GTT GCT CTT AAT GAC ACT AAC ACT ACT CAG 
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Lys Ser Gly Glu Val Thr Val Ala Leu Asn Asp Thr Asn Thr Thr Gln> 
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lie Ser Val Asn Ser Lys Lys Thr Thr Gin Leu Val Phe Thr Lys Gln> 

730 740 750 760 

* * * • 

TAC ACA ATA ACT GTA AAA CAA TAC GAC TCC GCA GGT ACC AAT TTA GAA 
ATG TGT TAT TGA CAT TTT GTT ATG CTG AGG CGT CCA TGG TTA AAT CTT 
Tyr Thr Ile Thr Val Lys Gln Tyr Asp Ser Ala Gly Thr Asn Leu Glu>
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Thr 


GAT TCT AAT GCG GTT TTA CTT GCT GTG AAA GAG GTT GAA GCG
CTA AGA TTA CGC CAA AAT GAA CGA CAC TTT CTC CAA CTT CGC
Asp Ser Asn Ala Val Leu Leu Ala Val Lys Glu Val Glu Ala>

200 210 220 230 240

TTG CTG TCA TCT ATA GAT GAA ATT GCT GCT AAA GCT ATT GGT AAA AAA
AAC GAC AGT AGA TAT CTA CTT TAA CGA CGA TTT CGA TAA CCA TTT TTT
Leu Leu Ser Ser Ile Asp Glu Ile Ala Ala Lys Ala Ile Gly Lys Lys>

250 260 270 280

ATA CAC CAA AAT AAT GGT TTG GAT ACC GAA TAT AAT CAC AAT GGA TCA 
TAT GTG GTT TTA TTA CCA AAC CTA TGG CTT ATA TTA GTG TTA CCT AGT 
Ile His Gln Asn Asn Gly Leu Asp Thr Glu Tyr Asn His Asn Gly Ser>



290 300 310 320 330 

«♦***•* • * 

TTG TTA GCG GGA CGT TAT GCA ATA TCA ACC CTA ATA AAA CAA AAA TTA 
AAC AAT CGC CCT GCA ATA CGT TAT AGT TGG GAT TAT TTT GTT TTT AAT 
Leu Leu Ala Gly Arg Tyr Ala Ile Ser Thr Leu Ile Lys Gln Lys Leu>

340 350 360 370 380 

«♦* * « • * * * 

GAT GGA TTG AAA AAT GAA GGA TTA AAG GAA AAA ATT GAT GCG GCT AAG 
CTA CCT AAC TTT TTA CTT CCT AAT TTC CTT TTT TAA CTA CGC CGA TTC 
Asp Gly Leu Lys Asn Glu Gly Leu Lys Glu Lys Ile Asp Ala Ala Lys>
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B-31 OSP C/ B-31 OSP A/ B-31 OSP B FUSION 



390 400 410 420 430

AAA TGT TCT GAA ACA TTT ACT AAT AAA TTA AAA GCA AAA CAC ACA GAT
TTT ACA AGA CTT TGT AAA TGA TTA TTT AAT TTT CGT TTT GTG TGT CTA
Lys Cys Ser Glu Thr Phe Thr Asn Lys Leu Lys Ala Lys His Thr Asp>

440 450 460 470 480

CTT GGT AAA GAA GGT GTT ACT GAT GCT GAT GCA AAA GAA GCC ATT TTA
GAA CCA TTT CTT CCA CAA TGA CTA CGA CTA CGT TTT CTT CGG TAA AAT
Leu Gly Lys Glu Gly Val Thr Asp Ala Asp Ala Lys Glu Ala Ile Leu>

490 500 510 520

AAA ACA AAT GGT ACT AAA ACT AAA GGT GCT GAA GAA CTT GGA AAA TTA
TTT TGT TTA CCA TGA TTT TGA TTT CCA CGA CTT CTT GAA CCT TTT AAT
Lys Thr Asn Gly Thr Lys Thr Lys Gly Ala Glu Glu Leu Gly Lys Leu>

530 540 550 560 570

TTT GAA TCA GTA GAG GTC TTG TCA AAA GCA GCT AAA GAG ATG CTT GCT
AAA CTT AGT CAT CTC CAG AAC AGT TTT CGT CGA TTT CTC TAC GAA CGA
Phe Glu Ser Val Glu Val Leu Ser Lys Ala Ala Lys Glu Met Leu Ala>

580 590 600 610 620

AAT TCA GTT AAA GAG CTT ACA AGC CCT GTT GTG GCA GAA AGT CCA AAA
TTA AGT CAA TTT CTC GAA TGT TCG GGA CAA CAC CGT CTT TCA GGT TTT
Asn Ser Val Lys Glu Leu Thr Ser Pro Val Val Ala Glu Ser Pro Lys>

630 640 650 660 670

AAA CCT AAG CAA AAT GTT AGC AGC CTT GAC GAG AAA AAC AGC GTT TCA
TTT GGA TTC GTT TTA CAA TCG TCG GAA CTG CTC TTT TTG TCG CAA AGT
Lys Pro Lys Gln Asn Val Ser Ser Leu Asp Glu Lys Asn Ser Val Ser>

680 690 700 710 720

GTA GAT TTG CCT GGT GAA ATG AAA GTT CTT GTA AGC AAA GAA AAA AAC
CAT CTA AAC GGA CCA CTT TAC TTT CAA GAA CAT TCG TTT CTT TTT TTG
Val Asp Leu Pro Gly Glu Met Lys Val Leu Val Ser Lys Glu Lys Asn>

730 740 750 760





AAA GAC GGC AAG TAC GAT CTA ATT GCA ACA GTA GAC AAG CTT GAG CTT 
TTT CTG CCG TTC ATG CTA GAT TAA CGT TGT CAT CTG TTC GAA CTC GAA 
Lys Asp Gly Lys Tyr Asp Leu Ile Ala Thr Val Asp Lys Leu Glu Leu>
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B-31 OSP C/ B-31. OSP A/ B-31 OSP B FUSION 



770 780 790 800 810



AAA GGA ACT TCT GAT AAA AAC AAT GGA TCT GGA GTA CTT GAA GGC GTA 
TTT CCT TGA AGA CTA TTT TTG TTA CCT AGA CCT CAT GAA CTT CCG CAT 
Lys Gly Thr Ser Asp Lys Asn Asn Gly Ser Gly Val Leu Glu Gly Val>



820 830 840 850 860



AAA GCT GAC AAA AGT AAA GTA AAA TTA ACA ATT TCT GAC GAT CTA GGT 
TTT CGA CTG TTT TCA TTT CAT TTT AAT TGT TAA AGA CTG CTA GAT CCA
Lys Ala Asp Lys Ser Lys Val Lys Leu Thr Ile Ser Asp Asp Leu Gly>



870 880 890 900 910



CAA ACC ACA CTT GAA GTT TTC AAA GAA GAT GGC AAA ACA CTA GTA TCA 
GTT TGG TGT GAA CTT CAA AAG TTT CTT CTA CCG TTT TGT GAT CAT AGT
Gln Thr Thr Leu Glu Val Phe Lys Glu Asp Gly Lys Thr Leu Val Ser>

920 930 940 950 960



AAA AAA GTA ACT TCC AAA GAC AAG TCA TCA ACA GAA GAA AAA TTC AAT 
TTT TTT CAT TGA AGG TTT CTG TTC AGT AGT TGT CTT CTT TTT AAG TTA 
Lys Lys Val Thr Ser Lys Asp Lys Ser Ser Thr Glu Glu Lys Phe Asn> 



970 980 990 1000



GAA AAA GGT GAA GTA TCT GAA AAA ATA ATA ACA AGA GCA GAC GGA ACC 
CTT TTT CCA CTT CAT AGA CTT TTT TAT TAT TGT TCT CGT CTG CCT TGG 
Glu Lys Gly Glu Val Ser Glu Lys Ile Ile Thr Arg Ala Asp Gly Thr>

1010 1020 1030 1040 1050 

\ 

AGA CTT GAA TAC ACA GGA ATT AAA AGC GAT GGA TCT GGA AAA GCT AAA
TCT GAA CTT ATG TGT CCT TAA TTT TCG CTA CCT AGA CCT TTT CGA TTT
Arg Leu Glu Tyr Thr Gly Ile Lys Ser Asp Gly Ser Gly Lys Ala Lys>

1060 1070 1080 1090 1100

GAG GTT TTA AAA GGC TAT GTT CTT GAA GGA ACT CTA ACT GCT GAA AAA
CTC CAA AAT TTT CCG ATA CAA GAA CTT CCT TGA GAT TGA CGA CTT TTT 
Glu Val Leu Lys Gly Tyr Val Leu Glu Gly Thr Leu Thr Ala Glu Lys> 



1110 1120 1130 1140 1150



ACA ACA TTG GTG GTT AAA GAA GGA ACT GTT ACT TTA AGC AAA AAT ATT 
TGT TGT AAC CAC CAA TTT CTT CCT TGA CAA TGA AAT TCG TTT TTA TAA
Thr Thr Leu Val Val Lys Glu Gly Thr Val Thr Leu Ser Lys Asn Ile> 
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46/33 

B-31 OSP C/ B-31 OSP A/ B-31 OSP B FUSION



TCA AAA TCT GGG GAA GTT TCA GTT GAA C77 AAT GAC ACT Car * 
AGT TTT AGA CCC CTT CAA AG7 CAA C77 GAA TTA C7G Tel £ irl ^ 
Ser Lys Ser cly Glu v«l Ser Val Glu Leu Asn Asp JS A^ Lr 

• . 1220 12<0 

GCT GCT ACT AAA AAA ACT GCA OCT TGG AAT GAC ir- ,~r k „ 

CGA CGA TGA TTT TTT TGA CGT cS lit £ A TCA t"£ irr ^ " A 

Ala Ala Thr Lys Lys Thr Ala Ala Trp ™ J£ ^ «J ^ 



1250 1260 1270 1280 1290



ACA ATT AGT GCT GAC AGC AAA AAA ACT AAA GA^ TC GTr tt^ ^ ' 
TGT TAA TCA CGA CTG TCG TTT TTT TGA TTT CT* ~C I! TTA ACA 

Thr lie Ser Ala Asp S .r Lys Lys £ £ £ £ Sf £ £ ™ 



1300 1310 1320 1330 1340



GAT GGT ACA ATT ACA GTA CAA CAA TAC AAC ACA GCT GGA ACC AGC CTA
CTA CCA TGT TAA TGT CAT GTT GTT ATG TTG TGT CGA CCT TGG TCG GAT
Asp Gly Thr Ile Thr Val Gln Gln Tyr Asn Thr Ala Gly Thr Ser Leu>



1350 1370 1380 



1350 



GAA GGA TCA GCA AGT GAA ATT AAA AAT CTT TCA GAG CTT AAA AAC GCT
CTT CCT AGT CGT TCA CTT TAA TTT TTA GAA AGT CTC GAA TTT TTG CGA
Glu Gly Ser Ala Ser Glu Ile Lys Asn Leu Ser Glu Leu Lys Asn Ala>



1400 
* « 

TTA AAA TAA 
AAT TTT ATT 
Leu Lys ***>
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FUSION SEQUENCE 






B-31 OSP A/ B-31 P-93 (1168-2100)
Sequence Range: 1 to 1720 





10 20 30 40

AAG CAA AAT GTT AGC AGC CTT GAC GAG AAA AAC AGC GTT TCA GTA GAT
TTC GTT TTA CAA TCG TCG GAA CTG CTC TTT TTG TCG CAA AGT CAT CTA
K   Q   N   V   S   S   L   D   E   K   N   S   V   S   V   D>

50 60 70 80 90

TTG CCT GGT GAA ATG AAA GTT CTT GTA AGC AAA GAA AAA AAC AAA GAC
AAC GGA CCA CTT TAC TTT CAA GAA CAT TCG TTT CTT TTT TTG TTT CTG
L   P   G   E   M   K   V   L   V   S   K   E   K   N   K   D>

100 110 120 130 140

GGC AAG TAC GAT CTA ATT GCA ACA GTA GAC AAG CTT GAG CTT AAA GGA
CCG TTC ATG CTA GAT TAA CGT TGT CAT CTG TTC GAA CTC GAA TTT CCT
G   K   Y   D   L   I   A   T   V   D   K   L   E   L   K   G>

150 160 170 180 190

ACT TCT GAT AAA AAC AAT GGA TCT GGA GTA CTT GAA GGC GTA AAA GCT
TGA AGA CTA TTT TTG TTA CCT AGA CCT CAT GAA CTT CCG CAT TTT CGA
T   S   D   K   N   N   G   S   G   V   L   E   G   V   K   A>

200 210 220 230 240

GAC AAA AGT AAA GTA AAA TTA ACA ATT TCT GAC GAT CTA GGT CAA ACC
CTG TTT TCA TTT CAT TTT AAT TGT TAA AGA CTG CTA GAT CCA GTT TGG
D   K   S   K   V   K   L   T   I   S   D   D   L   G   Q   T>

250 260 270 280

ACA CTT GAA GTT TTC AAA GAA GAT GGC AAA ACA CTA GTA TCA AAA AAA
TGT GAA CTT CAA AAG TTT CTT CTA CCG TTT TGT GAT CAT AGT TTT TTT
T   L   E   V   F   K   E   D   G   K   T   L   V   S   K   K>

290 300 310 320 330

GTA ACT TCC AAA GAC AAG TCA TCA ACA GAA GAA AAA TTC AAT GAA AAA
CAT TGA AGG TTT CTG TTC AGT AGT TGT CTT CTT TTT AAG TTA CTT TTT
V   T   S   K   D   K   S   S   T   E   E   K   F   N   E   K>

340 350 360 370 380

GGT GAA GTA TCT GAA AAA ATA ATA ACA AGA GCA GAC GGA ACC AGA CTT
CCA CTT CAT AGA CTT TTT TAT TAT TGT TCT CGT CTG CCT TGG TCT GAA
G   E   V   S   E   K   I   I   T   R   A   D   G   T   R   L>
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« 


390 400 410 420 430

GAA TAC ACA GGA ATT AAA AGC GAT GGA TCT GGA AAA GCT AAA GAG GTT
CTT ATG TGT CCT TAA TTT TCG CTA CCT AGA CCT TTT CGA TTT CTC CAA
E   Y   T   G   I   K   S   D   G   S   G   K   A   K   E   V>

440 450 460 470 480

TTA AAA GGC TAT GTT CTT GAA GGA ACT CTA ACT GCT GAA AAA ACA ACA
AAT TTT CCG ATA CAA GAA CTT CCT TGA GAT TGA CGA CTT TTT TGT TGT
L   K   G   Y   V   L   E   G   T   L   T   A   E   K   T   T>

490 500 510 520

TTG GTG GTT AAA GAA GGA ACT GTT ACT TTA AGC AAA AAT ATT TCA AAA
AAC CAC CAA TTT CTT CCT TGA CAA TGA AAT TCG TTT TTA TAA AGT TTT
L   V   V   K   E   G   T   V   T   L   S   K   N   I   S   K>

530 540 550 560 570

TCT GGG GAA GTT TCA GTT GAA CTT AAT GAC ACT GAC AGT AGT GCT GCT
AGA CCC CTT CAA AGT CAA CTT GAA TTA CTG TGA CTG TCA TCA CGA CGA
S   G   E   V   S   V   E   L   N   D   T   D   S   S   A   A>

580 590 600 610 620

ACT AAA AAA ACT GCA GCT TGG AAT TCA GGC ACT TCA ACT TTA ACA ATT
TGA TTT TTT TGA CGT CGA ACC TTA AGT CCG TGA AGT TGA AAT TGT TAA
T   K   K   T   A   A   W   N   S   G   T   S   T   L   T   I>

630 640 650 660 670

ACT GTA AAC AGT AAA AAA ACT AAA GAC CTT GTG TTT ACA AAA GAA AAC
TGA CAT TTG TCA TTT TTT TGA TTT CTG GAA CAC AAA TGT TTT CTT TTG
T   V   N   S   K   K   T   K   D   L   V   F   T   K   E   N>

680 690 700 710 720

ACA ATT ACA GTA CAA CAA TAC GAC TCA AAT GGC ACC AAA TTA GAG GGG
TGT TAA TGT CAT GTT GTT ATG CTG AGT TTA CCG TGG TTT AAT CTC CCC
T   I   T   V   Q   Q   Y   D   S   N   G   T   K   L   E   G>

730 740 750 760

TCA GCA GTT GAA ATT ACA AAA CTT GAT GAA ATT AAA AAC GCT TTA AAA
AGT CGT CAA CTT TAA TGT TTT GAA CTA CTT TAA TTT TTG CGA AAT TTT
S   A   V   E   I   T   K   L   D   E   I   K   N   A   L   K>
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770 



780 790 800 810



GGT CAC CCC ATG GAT GAA AAG CTT TTA AAA AGT AAA GAT GAT AAA GCA
CCA GTG GGG TAC CTA CTT TTC GAA AAT TTT TCA TTT CTA CTA TTT CGT
G   H   P   M   D   E   K   L   L   K   S   K   D   D   K   A>



820 830 840 850 860



AGT AAA GAT GGT AAA GCC TTG GAT CTT GAT CGA GAA TTA AAT TCT AAA 
TCA TTT CTA CCA TTT CGG AAC CTA GAA CTA GCT CTT AAT TTA AGA TTT
S   K   D   G   K   A   L   D   L   D   R   E   L   N   S   K>

870 880 890 900 910

***** 

GCT TCT AGC AAA GAA AAA AGT AAA GCC AAG GAA GAA GAA ATA ACC AAG 
CGA AGA TCG TTT CTT TTT TCA TTT CGG TTC CTT CTT CTT TAT TGG TTC 
ASSKEKSKAKEEEITK> 

920 930 940 950 960

GGT AAG TCA CAG AAA AGC TTA GGC GAT TTG AAT AAT GAT GAA AAT CTT
CCA TTC AGT GTC TTT TCG AAT CCG CTA AAC TTA TTA CTA CTT TTA GAA
G   K   S   Q   K   S   L   G   D   L   N   N   D   E   N   L>



970 980 990 1000 



ATG ATG CCA GAA GAT CAA AAA TTA CCT GAG GTT AAA AAA TTA GAT AGC
TAC TAC GGT CTT CTA GTT TTT AAT GGA CTC CAA TTT TTT AAT CTA TCG
M   M   P   E   D   Q   K   L   P   E   V   K   K   L   D   S>

1010 1020 1030 1040 1050

AAA AAA GAA TTT AAA CCT GTT TCT GAG GTT GAG AAA TTA GAT AAG ATT 
TTT TTT CTT AAA TTT GGA CAA AGA CTC CAA CTC TTT AAT CTA TTC TAA 
K KEFKPVSEVEKLDKI> 



1060 1070 1080 1090 1100



TTC AAG TCT AAT AAC AAT GTT GGA GAA TTA TCA CCG TTA GAT AAA TCT 
AAG TTC AGA TTA TTG TTA CAA CCT CTT AAT AGT GGC AAT CTA TTT AGA 
FKSNNNVGELSPLDKS> 

1110 1120 1130 1140 1150

TCT TAT AAA GAC ATT GAT TCA AAA GAG GAG ACA GTT AAT AAA GAT GTT 
AGA ATA TTT CTG TAA CTA AGT TTT CTC CTC TGT CAA TTA TTT CTA CAA 
SY KDIDS KEETVNK D V> 
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1 OSP / B-31 P-93 

1160 1170 1180 1190 1200

AAT TTG CAA AAG ACT AAG CCT CAG GTT AAA GAC CAA GTT ACT TCT TTG
TTA AAC GTT TTC TGA TTC GGA GTC CAA TTT CTG GTT CAA TGA AGA AAC
N   L   Q   K   T   K   P   Q   V   K   D   Q   V   T   S   L>



1210 1220 1230 1240 

« « « • • w « 

AAT GAA GAT TTG ACT ACT ATG TCT ATA GAT TCC AGT AGT CCT GTA TTT 
TTA CTT CTA AAC TGA TGA TAC AGA TAT CTA AGG TCA TCA GGA CAT AAA
N   E   D   L   T   T   M   S   I   D   S   S   S   P   V   F>



1250 1260 1270 1280 1290

TTA GAG GTT ATT GAT CCA ATT ACA AAT TTA GGA ACT CTT CAA CTT ATT
AAT CTC CAA TAA CTA GGT TAA TGT TTA AAT CCT TGA GAA GTT GAA TAA
L   E   V   I   D   P   I   T   N   L   G   T   L   Q   L   I>



1300 1310 1320 1330 1340 

GAT TTA AAT ACT GGT GTT AGG CTT AAA GAA AGC ACT CAG CAA GGC ATT 

CTA AAT TTA TGA CCA CAA TCC GAA TTT CTT TCG TGA GTC GTT CCG TAA 

D   L   N   T   G   V   R   L   K   E   S   T   Q   Q   G   I>



1350 1360 1370 1380 1390 

« * ••**«« «« 

CAG CGG TAT GGA ATT TAT GAA CGT GAA AAA GAT TTG GTT GTT ATT AAA 
GTC GCC ATA CCT TAA ATA CTT GCA CTT TTT CTA AAC CAA CAA TAA TTT 
Q   R   Y   G   I   Y   E   R   E   K   D   L   V   V   I   K>



1400 1410 1420 1430 1440

ATG GAT TCA GGA AAA GCT AAG CTT CAG ATA CTT GAT AAA CTT GAA AAT
TAC CTA AGT CCT TTT CGA TTC GAA GTC TAT GAA CTA TTT GAA CTT TTA
M   D   S   G   K   A   K   L   Q   I   L   D   K   L   E   N>

1450 1460 1470 1480

TTA AAA GTG GTA TCA GAG TCT AAT TTT GAG ATT AAT AAA AAT TCA TCT
AAT TTT CAC CAT AGT CTC AGA TTA AAA CTC TAA TTA TTT TTA AGT AGA
L   K   V   V   S   E   S   N   F   E   I   N   K   N   S   S>



1490 1500 1510 1520 1530 

« •«« * * * • « « 

CTT TAT GTT GAT TCT AAA ATG ATT TTA GTA GCT GTT AGG GAT AAA GAT 
GAA ATA CAA CTA AGA TTT TAC TAA AAT CAT CGA CAA TCC CTA TTT CTA 
L Y V DS K M I LV AV R DK D> 
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1540 1550 1560 1570 1580 

AGT AGT AAT GAT TGG AGA TTG GCC AAA TTT TCT CCT AAA AAT TTA GAT
TCA TCA TTA CTA ACC TCT AAC CGG TTT AAA AGA GGA TTT TTA AAT CTA
S   S   N   D   W   R   L   A   K   F   S   P   K   N   L   D>

1590 1600 1610 1620 1630 

GAG TTT ATT CTT TCA GAG AAT AAA ATT ATG CCT TTT ACT AGC TTT TCT 
CTC AAA TAA GAA AGT CTC TTA TTT TAA TAC GGA AAA TGA TCG AAA AGA
E   F   I   L   S   E   N   K   I   M   P   F   T   S   F   S>

1640 1650 1660 1670 1680

* • « * * • * * #M 

GTG AGA AAA AAT TTT ATT TAT TTG CAA GAT GAG TTT AAA AGT CTA GTT 
CAC TCT TTT TTA AAA TAA ATA AAC GTT CTA CTC AAA TTT TCA GAT CAA 
V RKNF I YLQDE FKS LV> 

1690 1700 1710 1720

* *• * ** * * 

ATT TTA GAT GTA AAT ACT TTA AAA AAA GTT AAG GGT CAC C 
TAA AAT CTA CAT TTA TGA AAT TTT TTT CAA TTC CCA GTG G 
I LDVNTLKKVKGH X> 
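
Because every block prints the coding strand, its complement, and the translation together, the three rows can be checked against one another to locate characters that were misread. The following is a minimal sketch of such a check, assuming the standard genetic code and one-letter residue codes. The function name `check_block` and the four-codon excerpt are illustrative and not taken from the source.

```python
# Standard genetic code, indexed by codon in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def check_block(sense_row: str, complement_row: str, protein_row: str):
    """Return (codon_number, sense, complement, residue) for each codon
    where the three printed rows do not agree with one another."""
    sense = sense_row.split()
    comp = complement_row.split()
    protein = protein_row.replace(" ", "").rstrip(">")
    problems = []
    for number, (s, c, aa) in enumerate(zip(sense, comp, protein), start=1):
        pairs_ok = s.translate(COMPLEMENT) == c   # strands are complementary
        codes_ok = CODON_TO_AA.get(s) == aa       # codon encodes the residue
        if not (pairs_ok and codes_ok):
            problems.append((number, s, c, aa))
    return problems


if __name__ == "__main__":
    # Illustrative excerpt with one misread base in the first coding codon.
    for problem in check_block("ACT AGT AAT GAT", "TCA TCA TTA CTA", "S S N D"):
        print(problem)
```

When two of the three rows agree, as in the excerpt above where the complement and the residue both point to `AGT`, the odd row out is the likelier misreading. The check only flags the position and does not decide which row is correct.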
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B-31 OSP 8/ 6-31 P41 (122-234) 

OSPB/Flal22-234 

Sequence Range: 1 to 1180 

10 20 30 40

* * * * 

GCA CAA AAA GGT GCT GAG TCA ATT GGT TCT CAA AAA GAA AAT GAT CTA 
CGT GTT TTT CCA CGA CTC AGT TAA CCA AGA GTT TTT CTT TTA CTA GAT 

A   Q   K   G   A   E   S   I   G   S   Q   K   E   N   D   L>



50 60 70 80 90



AAC CTT GAA GAC TCT AGT AAA AAA TCA CAT CAA AAC GCT AAA CAA GAC 
TTG GAA CTT CTG AGA TCA TTT TTT AGT GTA GTT TTG CGA TTT CTT CTG 
NLEDSSKKSHQNAKQ D > 

100 110 120 130 140

CTT CCT GCG GTG ACA GAA GAC TCA GTG TCT TTG TTT AAT GGT AAT AAA
GAA GGA CGC CAC TGT CTT CTG AGT CAC AGA AAC AAA TTA CCA TTA TTT
L   P   A   V   T   E   D   S   V   S   L   F   N   G   N   K>



150 160 170 180 190



ATT TTT GTA AGC AAA GAA AAA AAT AGC TCC GGC AAA TAT GAT TTA AGA 
TAA AAA CAT TCG TTT CTT TTT TTA TCG AGG CCG TTT ATA CTA AAT TCT 
I   F   V   S   K   E   K   N   S   S   G   K   Y   D   L   R>

200 210 220 230 240 

* 

GCA ACA ATT GAT CAG GTT GAA CTT AAA GGA ACT TCC GAT AAA AAC AAT 
CGT TGT TAA CTA GTC CAA CTT GAA TTT CCT TGA AGG CTA TTT TTG TTA 
A TIDQVELKGTSDKNN> 

250 260 270 280 

GGT TCT GGA ACC CTT GAA GGT TCA AAG CCT GAC AAG AGT AAA GTA AAA 
CCA AGA CCT TGG GAA CTT CCA AGT TTC GGA CTG TTC TCA TTT CAT TTT 
GSGTLEGSKPDKSKVK> 



290 300 310 320 330



TTA ACA GTT TCT GCT GAT TTA AAC ACA GTA ACC TTA GAA GCA TTT GAT 
AAT TGT CAA AGA CGA CTA AAT TTG TGT CAT TGG AAT CTT CGT AAA CTA 
L   T   V   S   A   D   L   N   T   V   T   L   E   A   F   D>

340 350 360 370 380 

* 

GCC AGC AAC CAA AAA ATT TCA AGT AAA GTT ACT AAA AAA CAG GGG TCA 
CGG TCG TTG GTT TTT TAA AGT TCA TTT CAA TGA TTT TTT GTC CCC AGT 
A SNQK I SS KVT KKQGS> 
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B-31 OSP B/ B-31 P41 (122-234) 

390 400 410 420 430 

* • * ««««*« 9 

ATA ACA GAG GAA ACT CTC AAA GCT AAT AAA TTA GAC TCA AAG AAA TTA
TAT TGT CTC CTT TGA GAG TTT CGA TTA TTT AAT CTG AGT TTC TTT AAT
I   T   E   E   T   L   K   A   N   K   L   D   S   K   K   L>

440 450 460 470 480

* * * • * * * • m m 

ACA AGA TCA AAC GGA ACT ACA CTT GAA TAC TCA CAA ATA ACA GAT GCT
TGT TCT AGT TTG CCT TGA TGT GAA CTT ATG AGT CTT TAT TGT CTA CGA 
T R S N G T T L E Y " S ~ V I T D A> 
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* 
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« 
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770 780 790 800 810 

ACA GCT GGA ACC AGC CTA GAA GGA TCA GCA AGT GAA ATT AAA AAT CTT 
TGT CGA CCT TGG TCG GAT CTT CCT AGT CGT TCA CTT TAA TTT TTA GAA 
T   A   G   T   S   L   E   G   S   A   S   E   I   K   N   L>



820 830 840 850 860



TCA GAG CTT AAA AAC GCT TTA AAA GGT CAC CCC ATG GCT CAA TAT AAC 
AGT CTC GAA TTT TTG CGA AAT TTT CCA GTG GGG TAC CGA GTT ATA TTG
S   E   L   K   N   A   L   K   G   H   P   M   A   Q   Y   N>



870 880 890 900 910



CAA ATG CAC ATG TTA TCA AAC AAA TCT GCT TCT CAA AAT GTA AGA ACA 
GTT TAC GTG TAC AAT AGT TTG TTT AGA CGA AGA GTT TTA CAT TCT TGT 
Q   M   H   M   L   S   N   K   S   A   S   Q   N   V   R   T>



920 930 940 950 960



GCT GAA GAG CTT GGA ATG CAG CCT GCA AAA ATT AAC ACA CCA GCA TCA 
CGA CTT CTC GAA CCT TAC GTC GGA CGT TTT TAA TTG TGT GGT CGT AGT 
AEE LGMQPAK INTPAS> 

970 980 990 1000 

CTT TCA GGG CTT CAA GCG TCT TGG ACT TTA AGA GTT CAT GTT GGA GCA 
GAA AGT CCC GAA GTT CGC AGA ACC TGA AAT TCT CAA GTA CAA CCT CGT 
LSGLQASWTLRVHVGA> 

1010 1020 1030 1040 1050 

* 

ACC CAA GAT GAA GCT ATT GCT GTA AAT ATT TAT GCA GCT AAT GTT GCA 
TGG GTT CTA CTT CGA TAA CGA CAT TTA TAA ATA CGT CGA TTA CAA CGT 
T   Q   D   E   A   I   A   V   N   I   Y   A   A   N   V   A>



1060 1070 1080 1090 



1100 



AAT CTT TTC TCT GGT GAG GGA GCT CAA ACT GCT CAG GCT GCA CCG GTT 
TTA GAA AAG AGA CCA CTC CCT CGA GTT TGA CGA GTC CGA CGT GGC CAA
NLFSGEGAQT AQAAPV> 

1110 1120 1130 1140 1150 

* 

CAA GAG GGT GTT CAA CAG GAA GGA GCT CAA CAG CCA GCA CCT GCT ACA 
GTT CTC CCA CAA GTT GTC CTT CCT CGA GTT GTC GGT CGT GGA CGA TGT 
Q   E   G   V   Q   Q   E   G   A   Q   Q   P   A   P   A   T>
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B-31 OSP B/ B-31 P41 (122-234) 

1160 1170 1180
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GCA CCT TOP CAA GGC GGA GTT GGT CAC C 
CGT GGA AGA GTT CCG CCT CAA CCA GTG G 
A   P   S   Q   G   G   V   G   H   X>
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Sequence Range: 1 to 1363 
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770 780 790 80*0 810 

* « • * * * « «« ^ 

ACA GCT GGA ACC AGC CTA GAA GGA TCA GCA AGT GAA ATT AAA AAT CTT
TGT CGA CCT TGG TCG GAT CTT CCT AGT CGT TCA CTT TAA TTT TTA GAA 
TAGTSLEGSASEIKNL> 

820 830 840 850 860 

* * * *•# ## ^ 

TCA GAG CTT AAA AAC GCT TTA AAA GGT CAC CCC ATG GCT CAA TAT AAC
AGT CTC GAA TTT TTG CGA AAT TTT CCA GTG GGG TAC CGA GTT ATA TTG
S   E   L   K   N   A   L   K   G   H   P   M   A   Q   Y   N>



870 880 890 900 910



CAA ATG CAC ATG TTA TCA AAC AAA TCT GCT TCT CAA AAT GTA AGA ACA 
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( 1832 ] a ... gcc c - 9 9* 

2. OspC-TR 280 290 300 310 320

[ 1786 ] ... a.. ..a ... gc a aa c >

3. OspC-K4 290 300 310 320 330

[ 1774 ] a ... gcc c * 9 ^ 



340 350 360 370 380



OspC-B31 GAT GGA TTG AAA AAT GAA GGA TTA AAG GAA AAA ATT GAT GCG GCT AAG
CTA CCT AAC TTT TTA CTT CCT AAT TTC CTT TTT TAA CTA CGC CGA TTC

ttt 

1. OspC-PK 340 350 360 370 380 390

[ 1832 ] ag. aa a ... .a ac. g ca aa *

2. OspC-TR 330 340 350 360 370

[ 1786 ] ag. .t t tca ... .a a a. a .a

ttc
i

3. OspC-K4 340 350 360 370 380
[ 1774 ] ag. aa a ... .ag t a a .a 

390 400 410 420 430 

********* 
OspC-B31 AAA TGT TCT GAA ACA TTT ACT AAT AAA TTA AAA GAA AAA CAC ACA GAT
TTT ACA AGA CTT TGT AAA TGA TTA TTT AAT TTT CTT TTT GTG TGT CTA

1. OspC-PK 400 410 420 430 

[ 1832 ] c ... ga c agt ggt ..t g > 

2. OspC-TR 380 390 400 410 420 

[ 1786 ] g.t c ... .a c. ..g c t .gt ..t g. . ..g> 

3. OspC-K4 390 400 410 420 

[ 1774 ] ..c ca g 9. c gt tct ..t g.. c.a> 



440 450 460 470 480
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OapC-B31 CTT GGT AAA GAA GGT GTT ACT GAT GCT GAT GCA AAA GAA GCC ATT TTA 
GAA CCA TTT CTT CCA CAA TGA CTA CGA CTA CGT TTT CTT CGG TAA AAT 

1. OspC-PK 440 450 460 470 480

[ 1832 ] c ... cg .a. .c. ..c ... .a. c c. ..t >

2. OspC-TR 430 440 450 460 470

[ 1786 ] t. c. a.c ... cag ... .a. a a.. ..t >

3. OspC-K4 430 440 450 460 470

[ 1774 ] a gtt .ct .c. .c a. c t >

490 500 510 520 

* * * *** * * * 

OspC-B31 AAA ACA AAT GGT ACT AAA ACT AAA GGT GCT GAA GAA CTT GGA AAA TTA
TTT TGT TTA CCA TGA TTT TGA TTT CCA CGA CTT CTT GAA CCT TTT AAT

1. OspC-PK 490 500 510 520 530 

[ 1832 ] c. .ca ... .cc ga a t.. aa. g.t ...> 

2. OspC-TR 480 490 500 510 

[ 1786 ] c. ..a gac ..g a a. g.g

3. OspC-K4 480 490 500 510 520

[ 1774 ] ..g t cc ga. ..g a., -c. ... aa. g.c ...> 

530 540 550 560 570 

* * * * * * * * , * * 

OspC-B31 TTT GAA TCA GTA GAG GTC TTG TCA AAA GCA GCT AAA GAG ATG CTT GCT
AAA CTT AGT CAT CTC CAG AAC AGT TTT CGT CGA TTT CTC TAC GAA CGA 

1. OspC-PK 540 550 560 570 580 

[ 1832 ] a .gt ... .t c . . . ta gca ..a a..> 

2. OspC-TR 520 530 540 550 560

[ 1786 ] a c. ..aag g c. .ca gca t.a a..>

3. OspC-K4 530 540 550 560 570 

[ 1774 ] .c a ag. ... g g c.. ..a gca t.a ...> 

580 590 600 610 620 

* * ****** * 

OspC-B31 AAT TCA GTT AAA GAG CTT ACA AGC CCT GTT GTG GCA GAA AGT CCA AAA 

TTA AGT CAA TTT CTC GAA TGT TCG GGA CAA CAC CGT CTT TCA GGT TTT 

1. OspC-PK 590 600 610 620 630 

[ 1832 ] a t a >

2. OspC-TR 570 580 590 600 610

[ 1786 ] at > 

3. OspC-K4 580 590 600 610 620 

[ 1774 ] a at > 

630 
* * 

OspC-B31 AAA CCT TAA 
TTT GGA ATT 

1. OspC-PK 

[ 1832 ] >

2. OspC-TR 620

[ 1786 ] >

3. OspC-K4 630 

I 1774 J > 
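
In the alignment rows above, each variant strain is printed beneath the OspC-B31 reference with only scattered lowercase bases and dots. The following is a minimal sketch of how such a row could be expanded back into a full sequence, assuming that a dot marks a base identical to the reference and a lowercase letter marks a substitution. That reading of the notation is an assumption, as are the function names `expand_variant` and `differences` and the nine-base example. Insertions and deletions are not handled, and the printed rows would first need their column spacing restored.

```python
def expand_variant(reference: str, variant: str) -> str:
    """Rebuild a full variant sequence from a dot-notation alignment row.

    Assumes '.' means "same base as the reference" and any other character
    is a substituted base; both rows must already be the same length.
    """
    if len(reference) != len(variant):
        raise ValueError("reference and variant rows must be the same length")
    return "".join(
        ref if var == "." else var.upper()
        for ref, var in zip(reference, variant)
    )


def differences(reference: str, variant: str):
    """List (position, reference_base, variant_base) for each substitution."""
    return [
        (position, ref, var.upper())
        for position, (ref, var) in enumerate(zip(reference, variant), start=1)
        if var != "."
    ]


if __name__ == "__main__":
    # Hypothetical nine-base rows, not copied from the figure.
    reference = "GATGGATTG"
    variant = "ag......a"
    print(expand_variant(reference, variant))
    print(differences(reference, variant))
```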
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9S//3S 

10 . 20 ^ 30 . <° * 

^ ™* Irr TTA TTT TTA TTG CTC TCA ATA TCT TGT TCT TTA GAT 
BO oopD GAT oil Z? HI Hi SS St AAC GAG AGT TAT AGA ACA AGA AAT CTA 



20 30 40 



1. P-Gau o 1° * a > 

[ 2804 ] 

!?0 30 40 

2. DK29 OS 10 20 ° U g > 

[ 2786 ] c 

in 20 30 40 

3. K48 osp 10 ^ > 

I 2786 ] 

* * ^ „* m rA aaa GAT TAC GAG TCA AAA AAA CAG AGT ATA 

BO OspO AAT GAA GOT GTA AAC TCA AAA GAT T^ ^ ^ ^ ^ ^ ^ ^ 

1. P-Gau O50 60 70 80 .90 > 

t 2804 ] 

2. DK2 
( 2786 ] 



2. DK29 OS50 60 70 80 ^90 > 

3. K48 ospSO 60 70 80 90 > 

[ 2786 ] 9 



100 no 



120 130 140 



o** ttr AAT CAG CTA TTG GGG CAA ACT ACA AAT TCA CTA AAA 
BO ospD ^ CTA GGT GAA TTA AAT CAG CTA ^ ^ ^ ^ ^ ^ ^ ^ m 

1. P-Gau o 100 HO 120 130 140^ ^ 
[ 2804 ] 

2. DK2 

[ 2786 ] 

3. K48 CSP 100 HO 120 130 140^ ^ 
1 2786 ] 



2. DK29 os 100 HO 120 130 140^ 



150 160 



170 180 190 



* 



BO -*> 35 S S 5S S cta SS 3 tta S K £ SS S 5K 

1. P-Cauo 150 160 170 180 ..??..> 
t 2804 J *. 

2. 0K29OS 150 160 170 180 

[ 2786 ] 

3. K48C8P ISO 160 170 180 190 ^ 
I 2786 ] 

200 210 220 ^ 230 ^ 240 

- "*> ^ « S - S S £ TCA SS S TTA 3S s K 22 X 
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1. P-Gau o- 200 210 220 230 240 

[ 2804 ] > 

2. DK29 os 200 210 220 230 240 

[ 2786 ] >

3. K48 osp 200 210 220 230 240

[ 2786 ] >

250 260 270 280 

* * * * * * * * * 

BO ospD GCA GAT CAG GTA AAA GGT CAA CAA CAA ATA TGC ACG ATT TAG CTC AAA ' 

CGT CTA GTC CAT TTT CCA GTT GTT GTT TAT ACG TGC TAA ATC GAG TTT 

1. P-Gau o 250 260 270 280 

[ 2804 ] >

2. DK29 os 250 260 270 280

[ 2786 ] g >

3. K48 osp 250 260 270 280 

[ 2786 ] g > 

290 300 310 320 330 

* * * * * * * * * * 

BO ospD TGG CAG AAA TAG ATT TAG AAA AAA TAA AGG AAT CTA GTG ATA AAG TAA 

ACC GTC TTT ATC TAA ATC TTT TTT ATT TCC TTA GAT CAC TAT TTC ATT

1. P-Gau o 290 300 310 320 330

[ 2804 ] >

2. DK29 os 290 300 310 320 330
[ 2786 ] . . >

3. K48 osp 290 300 310 320 330
[ 2786 ] ... >

340 350 360 370 380 

* * * * * * * * * 

BO ospD TAG TTG CGG CTA ATG TTG CGA AAG AAG CAT ATA ACC TTA CTA AAG CAG 

ATC AAC GCC GAT TAC AAC GCT TTC TTC GTA TAT TGG AAT GAT TTC GTC 

1. P-Gau o 340 350 360 370 380
[ 2804 ] >

2. DK29 os 340 350 360 370 380

[ 2786 ] >

3. K48 osp 340 350 360 370 380

[ 2786 ] >

390 400 410 420 430 

* * * * * * * * 

BO ospD TAG AAC AAA ATA TGC AAA AAC TGT ACA AAG AGC AAG AAG AGC AAC TAA 

ATC TTG TTT TAT ACG TTT TTG ACA TGT TTC TCG TTC TTC TCG TTG ATT 

1. P-Gau o 390 400 410 420 430 

[ 2804 ] >

2. DK29 os 390 400 410 420 430 



Figure 39 (2 of 4)
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91)122 

[ 2786 ] >

3. K48 osp 390 400 410 420 430
[ 2786 ] >

440 450 460 470 480 

********** 

BO ospD AAC ACT ATC TGA TTC TGA TGA AAC AGA ACG AGT TTC TGA TGA AAT AAA

TTG TGA TAG ACT AAG ACT ACT TTG TCT TGC TCA AAG ACT ACT TTA TTT 

1. P-Gau o 440 450 460 470 480 

[ 2804 ] > 

2. DK29 os 440 450 460 470 480
[ 2786 ] g * >

3. K48 osp 440 450 460 470 480
[ 2786 ] g >

490 500 510 520 

* ** * * * * ** 

BO ospD ACA AGC TAA AGA GGC TGT AGA AAT AGC TTG GAA AGC CAC AGT AAA AGT 

TGT TCG ATT TCT CCG ACA TCT TTA TCG AAC CTT TCG GTG TCA TTT TCA 

1. P-Gau o 490 500 510 520 
[ 2804 ] >

2. DK29 os 490 500 510 520
[ 2786 ] >

3. K48 osp 490 500 510 520

[ 2786 ] . ; >

530 540 550 560 570 

* ** **** * * * 

BO OspD AAA AGA TGA GTT AAT TGA TGT AGA AAA TGC AGT CAA AGA GGC ATT GGA 

TTT TCT ACT CAA TTA ACT ACA TCT TTT ACG TCA GTT TCT CCG TAA CCT 

1. P-Gau o 530 540 550 560 570

[ 2804 ] >

2. DK29 os 530 540 550 560 570

[ 2786 ] >

3. K48 osp 530 540 550 560 570
[ 2786 ] >

580 590 600 610 620 

** * * * " * ** * 

BO ospD TAA AAT AAA GAC AGA AAC CGC GAA CAA TAC AAA ACT TAC AGA TAT AGA 

ATT TTA TTT CTG TCT TTG GCG CTT GTT ATG TTT TGA ATG TCT ATA TCT 

1. P-Gau o 580 590 600 610 620 

[ 2804 ] * >

2. DK29 os 580 590 600 610 620

[ 2786 ] >

3. K48 osp 580 590 600 610 620
[ 2786 ] g > 



Figure 39 (3 of 4)
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630 640 650 660 670 

********** 
BO ospD AGA AGT AGC AGA GTT AGT ATT ACA GAT AGC CAA AAA TGT AGC GGA AAT

TCT TCA TCG TCT CAA TCA TAA TGT CTA TCG GTT TTT ACA TCG CCT TTA 

1. P-Gau o 630 640 650 660 670 

[ 2804 ] a >

2. DK29 OS 630 640 650 660 670 

[ 2786 ] a > 

3. K48 osp 630 640 650 660 670 

[ 2786 ] a # > 

680 690 700 

****** 

BO ospD AGC GCA AGA AGT TGT GGC CTT GTT AAA TAC TT 

TCG CGT TCT TCA ACA CCG GAA CAA TTT ATG AA 

1. P-Gau o 680 690 700

[ 2804 ] >

2. DK29 os 680 690 700

[ 2786 ] >

3. K48 osp 680 690 700

[ 2786 ] > 



Figure 39 (4 of 4)



WO 95/12676 



PCT/US94/12352 
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P41 

Sequence Range: 1 to 1011 



10 20 30 40 

ATG ATT ATC AAT CAT AAT ACA TCA GCT ATT AAT GCT TCA AGA AAT AAT
TAC TAA TAG TTA GTA TTA TGT AGT CGA TAA TTA CGA AGT TCT TTA TTA
Met Ile Ile Asn His Asn Thr Ser Ala Ile Asn Ala Ser Arg Asn Asn>



50 60 70 80 90



GGC ATT AAC GCT GCT AAT CTT AGT AAA ACT CAA GAA AAG CTT TCT AGT 
CCG TAA TTG CGA CGA TTA GAA TCA TTT .. TGA -GTX' .CTT TTC GAA AGA TCA 
Gly He Asn Ala Ala Asn Leu Ser Lys Thr Gin Glu Lys Leu Ser £er> 



100 



HO 120 130 140 



GGC TAC AGA ATT AAT CGA GCT TCT GAT GAT GCT GCT GGC ATG CGA GTT 
CCG ATG TCT TAA TTA GCT CGA AGA CTA CTA CGA CGA CCG. TAC CCT CAA 
Gly Tyr Arg He Asn Arg Ala Ser Asp Asp Ala Ala Gly Met Gly Val> 



150 



160 170 180 ISO 



TCT GGT AAG ATT AAT GCT CAA ATA AGA GGT TTG TCA CAA GCT TCT AGA 
AGA CCA TTC TAA TTA CGA GTT TAT TCT CCA AAC AGT GTT CGA AGA TCT 
Ser Gly Lys He Asn Ala Gin He Arg Gly Leu Ser Gin Ala Ser Arg> 

200 210 220 230 240 

AAT ACT TCA AAG GCT ATT AAT TTT ATT CAG ACA ACA GAA GGG AAT TTA 
TTA TGA AGT TTC CGA TAA TTA AAA TAA GTC TGT TGT CTT CCC TTA AAT 
Asn Thr Ser Lys Ala He Asn Phe He Gin Thr Thr Glu Gly Asn Leu> 

250 260 270 280 

AAT GAA GTA GAA AAA GTC TTA GTA AGA ATG AAG GAA TTG CCA GTT CAA 
TTA CTT CAT CTT TTT CAG AAT CAT TCT TAC TTC CTT AAC CGT CAA GTT 
Asn Glu Val Glu Lys Val Leu Val Arg Met Lys Glu Leu Ala Val Gln> 



290 



300 310 320 330 



w ™ - 

TCA GGT AAC GGC ACA TAT TCA GAT CCA GAC AGA GGT TCT ATA CAA ATT 
AGT CCA TTG CCG TGT ATA AGT CTA CGT CTG TCT CCA AGA TAT GTT TAA 
Ser Gly Asn Gly Thr Tyr Ser Asp Ala Asp Arg Gly Ser He Gin Ile> 



340 



3S0 360 370 380 



GAA ATA GAG CAA CTT ACA GAC GAA ATT AAT AGA ATT GCT GAT CAA GCT 
CTT TAT CTC GTT GAA TGT CTG CTT TAA TTA TCT TAA CGA CTA GTT CGA 
Glu He Glu Gin Leu Thr Asp Glu He Asn. Arg He Ala Asp Gin Ala> 
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390 
400 410 420 430

CAA TAT AAC CAA ATG CAC ATG TTA TCA AAC AAA TCT GCT TCT CAA AAT
GTT ATA TTG GTT TAC GTG TAC AAT AGT TTG TTT AGA CGA AGA GTT TTA
Gln Tyr Asn Gln Met His Met Leu Ser Asn Lys Ser Ala Ser Gln Asn>

440 450 460 470 480

GTA AGA ACA GCT GAA GAG CTT GGA ATG CAG CCT GCA AAA ATT AAC ACA
CAT TCT TGT CGA CTT CTC GAA CCT TAC GTC GGA CGT TTT TAA TTG TGT
Val Arg Thr Ala Glu Glu Leu Gly Met Gln Pro Ala Lys Ile Asn Thr>

490 500 510 520

CCA GCA TCA CTT TCA GGG CTT CAA GCG TCT TGG ACT TTA AGA GTT CAT
GGT CGT AGT GAA AGT CCC GAA GTT CGC AGA ACC TGA AAT TCT CAA GTA
Pro Ala Ser Leu Ser Gly Leu Gln Ala Ser Trp Thr Leu Arg Val His>

530 540 550 560 570

GTT GGA GCA ACC CAA GAT GAA GCT ATT GCT GTA AAT ATT TAT GCA GCT
CAA CCT CGT TGG GTT CTA CTT CGA TAA CGA CAT TTA TAA ATA CGT CGA
Val Gly Ala Thr Gln Asp Glu Ala Ile Ala Val Asn Ile Tyr Ala Ala>

580 590 600 610 620

AAT GTT GCA AAT CTT TTC TCT GGT GAG GGA GCT CAA ACT GCT CAG GCT
TTA CAA CGT TTA GAA AAG AGA CCA CTC CCT CGA GTT TGA CGA GTC CGA
Asn Val Ala Asn Leu Phe Ser Gly Glu Gly Ala Gln Thr Ala Gln Ala>

630 640 650 660 670

GCA CCG GTT CAA GAG GGT GTT CAA CAG GAA GGA GCT CAA CAG CCA GCA
CGT GGC CAA GTT CTC CCA CAA GTT GTC CTT CCT CGA GTT GTC GGT CGT
Ala Pro Val Gln Glu Gly Val Gln Gln Glu Gly Ala Gln Gln Pro Ala>

680 690 700 710 720

CCT GCT ACA GCA CCT TCT CAA GGC GGA GTT AAT TCT CCT GTT AAT GTT
GGA CGA TGT CGT GGA AGA GTT CCG CCT CAA TTA AGA GGA CAA TTA CAA
Pro Ala Thr Ala Pro Ser Gln Gly Gly Val Asn Ser Pro Val Asn Val>

730 740 750 760

ACA ACT ACA GTT GAT GCT AAT ACA TCA CTT GCT AAA ATT GAA AAT GCT
TGT TGA TGT CAA CTA CGA TTA TGT AGT GAA CGA TTT TAA CTT TTA CGA
Thr Thr Thr Val Asp Ala Asn Thr Ser Leu Ala Lys Ile Glu Asn Ala>

770 780 790 800 810

ATT AGA ATG ATA AGT GAT CAA AGG GCA AAT TTA GGT GCT TTC CAA AAT
TAA TCT TAC TAT TCA CTA GTT TCC CGT TTA AAT CCA CGA AAG GTT TTA
Ile Arg Met Ile Ser Asp Gln Arg Ala Asn Leu Gly Ala Phe Gln Asn>

820 830 840 850 860

AGA CTT GAA TCT ATA AAG AAT AGT ACT GAG TAT GCA ATT GAA AAT CTA
TCT GAA CTT AGA TAT TTC TTA TCA TGA CTC ATA CGT TAA CTT TTA GAT
Arg Leu Glu Ser Ile Lys Asn Ser Thr Glu Tyr Ala Ile Glu Asn Leu>

870 880 890 900 910

AAA GCA TCT TAT GCT CAA ATA AAA GAT GCT ACA ATG ACA GAT GAG GTT
TTT CGT AGA ATA CGA GTT TAT TTT CTA CGA TGT TAC TGT CTA CTC CAA
Lys Ala Ser Tyr Ala Gln Ile Lys Asp Ala Thr Met Thr Asp Glu Val>

920 930 940 950 960

GTA GCA GCA ACA ACT AAT ATG ATT TTA ACA CAA TCT GCA ATG GCA ATG
CAT CGT CGT TGT TGA TTA TAC TAA AAT TGT GTT AGA CGT TAC CGT TAC
Val Ala Ala Thr Thr Asn Met Ile Leu Thr Gln Ser Ala Met Ala Met>

970 980 990 1000

ATT GCG CAG GCT AAT CAA GTT CCC CAA TAT GTT TTG TCA TTG CTT AGA
TAA CGC GTC CGA TTA GTT CAA GGG GTT ATA CAA AAC AGT AAC GAA TCT
Ile Ala Gln Ala Asn Gln Val Pro Gln Tyr Val Leu Ser Leu Leu Arg>

1010

TAA
ATT
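
The three-row layout used for the P41 sequence (coding strand, complementary strand, three-letter translation) can be regenerated mechanically, which is a convenient way to cross-check a scanned row. The following is only a minimal sketch: the helper names `complement_row` and `translate_row` are hypothetical, and it assumes the standard genetic code and codons separated by single spaces, as printed in the figure.

```python
# Sketch: rebuild the complementary-strand and amino-acid rows for one codon row.
# Assumptions (not from the source): standard genetic code, space-separated codons.

COMPLEMENT = str.maketrans("ACGT", "TGCA")

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard codon table built in TCAG order.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***",
}


def complement_row(row: str) -> str:
    """Return the base-by-base complement of each codon (strand is not reversed)."""
    return " ".join(codon.translate(COMPLEMENT) for codon in row.split())


def translate_row(row: str) -> str:
    """Translate each codon to its three-letter amino-acid abbreviation."""
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in row.split())


if __name__ == "__main__":
    first_row = "ATG ATT ATC AAT CAT AAT ACA TCA GCT ATT AAT GCT TCA AGA AAT AAT"
    print(first_row)
    print(complement_row(first_row))
    print(translate_row(first_row))
```

A row whose printed complement or translation disagrees with the regenerated one is a candidate scanning error rather than a real sequence difference.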
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Alignment List 



Search Analysis for Sequence: B31-41kD Matrix: DNA database matrix

Search from 1 to 1011 where origin = 1 Score Region from 1 to 1011 

Date: October 22,1993 Maximum possible score: 4044 

Time: 15:03:24 

Database: UserFolder: 41 kD Flagellin clones 



10 20 30 • 40 

*■ * * * * * * *f 

B31-41kD ATG ATT ATC AAT CAT AAT ACA TCA GCT ATT AAT GCT TCA AGA AAT AAT 
TAC TAA TAG TTA GTA TTA TGT AGT CGA TAA TTA CGA AGT TCT TTA TTA

1. KA-41kD 10 20 30 40

[ 3996 ] >

2. P-Gau-4 10 20 30 40

[ 3696 ] >

3. BO-41kD 10 20 30 40

[ 3684 ] >

4. DK29-41 10 20 30 40

[ 3672 ] >

5. PKO-41k 10 20 30 40

[ 3672 ] >



50 60 70 80 90 

* * * * * * * * » ^ 

B31-41kD GGC ATT AAC GCT GCT AAT CTT AGT AAA ACT CAA GAA AAG CTT TCT AGT 
CCG TAA TTG CGA CGA TTA GAA TCA TTT TGA GTT CTT TTC GAA AGA TCA 

1. KA-41kD 50 60 70 80 90

[ 3996 ] >

2. P-Gau-4 50 60 70 80 90

[ 3696 ] .c t c g >

3. BO-41kD 50 60 70 80 90

[ 3684 ] .c t c g >

4. DK29-41 50 60 70 80 90

[ 3672 ] ..t t g >

5. PKO-41k 50 60 70 80 90

[ 3672 ] .c t c g ... .c >

100 110 120 130 140 

* * * .* * ^ * « • * 

B31-41kD GGC TAC AGA ATT AAT CGA GCT TCT GAT GAT GCT GCT GGC ATG GGA GTT 



CCG ATG TCT TAA TTA GCT CGA AGA CTA CTA CGA CGA CCG TAC CCT CAA 



FIGURE 41 (1 of 8)
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1. KA-41kD 100 HO 120 I 30 140 

[ 3996 ] ..g

2. P-Gau-4 100 110 120 130 140
[ 3696 ] ..t

3. BO-41kD 100 110 120 130 140
[ 3684 ] ..t

4. DK29-41 100 110 120 130 140
[ 3672 ] ..t a

5. PKO-41k 100 110 120 130 140
[ 3672 ] ..t



150 



160 170 180 190 



B31-41kD TCT GGT AAG ATT AAT GCT CAA ATA AGA GGT TTG TCA CAA GCT TCT AGA
AGA CCA TTC TAA TTA CGA GTT TAT TCT CCA AAC AGT GTT CGA AGA TCT

1. KA-41kD 150 160 170 180 190
[ 3996 ]

2. P-Gau-4 150 160 170 180 190

[ 3696 ] c c. ..c ..a >

3. BO-41kD 150 160 170 180 190
[ 3684 ] c c - a

4. DK29-41 150 160 170 180 190
[ 3672 ] g

5. PKO-41k 150 160 170 180 190
[ 3672 ] c c - a

200 210 220 230 240 

B31-41kD AAT ACT TCA AAG GCT ATT AAT TTT ATT CAG ACA ACA GAA GGG AAT TTA
TTA TGA AGT TTC CGA TAA TTA AAA TAA GTC TGT TGT CTT CCC TTA AAT

1. KA-41kD 200 210 220 230 240
[ 3996 ]

2. P-Gau-4 200 210 220 230 240
[ 3696 ] ..c a c a

3. BO-41kD 200 210 220 230 240
[ 3684 ] ..c a c

4. DK29-41 200 210 220 230 240
[ 3672 ] ..c a a

5. PKO-41k 200 210 220 230 240
[ 3672 ] ..c a c



.a 
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250 260 270 280 



B31-41kD AAT GAA GTA GAA AAA GTC TTA GTA AGA ATG AAG GAA TTG GCA GTT CAA 
TTA CTT CAT CTT TTT CAG AAT CAT TCT TAC TTC CTT AAC CGT CAA GTT 

1. KA-41kD 250 260 270 280 

{ 3996 ] 

2. P-Gau-4 250 260 270 280 

C 3696 ] t a a 

3. BO-41kD 250 260 270 .280 
t 3684 J t «.. ..a a 



4. DK29-41 250 260 270 280 
t 3672 ] t '.. .— /.. .. a ... > 



.> 



5. PKO-41k 250 260 270 280 
[ 3672 ] t a a 

290 300 * 310 320 330 

* * * * * « * » , * * 

B31-41kD TCA GGT AAC GGC ACA TAT TCA GAT GCA GAC AGA GGT TCT ATA CAA ATT
AGT CCA TTG CCG TGT ATA AGT CTA CGT CTG TCT CCA AGA TAT GTT TAA 

1. KA-41k 290 300 310 320 330
[ 3996 ]

2. P-Gau- 290 300 310 320 330
[ 3696 ] a ..g c

3. BO-41k 290 300 310 320 330
[ 3684 ] a ..g .c t. .>

4. DK29-4 290 300 310 320 330

[ 3672 ] t c >

5. PKO-41 290 300 310 320 330

[ 3672 ] a ..g c t - g ...>



340 350 360 370 380 

* * * * * * * * « 

B31-41kD GAA ATA GAG CAA CTT ACA GAC GAA ATT AAT AGA ATT GCT GAT CAA GCT 
CTT TAT CTC GTT GAA TGT CTG CTT TAA TTA TCT TAA CGA CTA GTT CGA 

1. KA-41kD 340 350 360 370 380

[ 3996 ] >

2. P-Gau-4 340 350 360 370 380

[ 3696 ] g >

3. BO-41kD 340 350 360 370 380

[ 3684 ] g >

4. DK29-41 340 350 360 370 380
[ 3672 ]

5. PKO-41k 340 350 360 370 380
[ 3672 ] g



390 400 410 420 430 

« • * 

B31-41kD CAA TAT AAC CAA ATG CAC ATG TTA TCA AAC AAA TCT GCT TCT CAA AAT
GTT ATA TTG GTT TAC GTG TAC AAT AGT TTG TTT AGA CGA AGA GTT TTA

1. KA-41kD 390 400 410 420 430 



2. P-Gau-4 390 400 410 420 430

3. BO-41kD 390 400 410 420 430

4. DK29-41 390 400 410 420 430

5. PKO-41k 390 400 410 420 430





440 450 460 470 480 

* «** .** * * * 

B31-41kD GTA AGA ACA GCT GAA GAG CTT GGA ATG CAG CCT GCA AAA ATT AAC ACA 
CAT TCT TGT CGA CTT CTC GAA CCT TAC GTC GGA CGT TTT TAA TTG TGT 

1. KA-41kD 440 450 460 470 480

[ 3996 ] >

2. P-Gau-4 440 450 460 470 480

[ 3696 ] ..a >

3. BO-41kD 440 450 460 470 480

[ 3684 ] ... .a >

4. DK29-41 440 450 460 470 480

[ 3672 ] a a c >

5. PKO-41k 440 450 460 470 480

[ 3672 ] ... .a >



490 500 510 520 

* * * **** * * 

B31-41kD CCA GCA TCA CTT TCA GGG CTT CAA GCG TCT TGG ACT TTA AGA GTT CAT 
GGT CGT AGT GAA AGT CCC GAA GTT CGC AGA ACC TGA AAT TCT CAA GTA 

1. KA-41kD 490 500 510 520 
[ 3996 ] tc 

2. P-Gau-4 490 500 510 520 
[ 3696 ] a tc t 
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3. B0-41kD 490 500 510 520 

[ 3684 ] a tc t

4. DK29-41 490 500 510 520

[ 3672 ] ... ..g a tc t

5. PKO-41k 490 500 510 520

[ 3672 ] a tc t



530 540 550 560 570 

* * * * »*.* * * * 

B31-41kD GTT GGA GCA ACC CAA GAT GAA GCT ATT GCT GTA AAT ATT TAT GCA GCT
CAA CCT CGT TGG GTT CTA CTT CGA TAA CGA CAT TTA TAA ATA CGT CGA 



1. KA-41k 530 540 550 560 570

2. P-Gau- 530 540 550 560 570
[ 3696 ] ..g ...

3. BO-41k 530 540 550 560 570
[ 3684 ] ..g ...

4. DK29-4 530 540 550 560 570
[ 3672 ] ..g ...

5. PKO-41 530 540 550 560 570
[ 3672 ] ..g ...



580 590 600 610 620 

* * « ***** * 

B31-41kD AAT GTT GCA AAT CTT TTC TCT GGT GAG GGA GCT CAA ACT GCT CAG GCT 
TTA CAA CGT TTA GAA AAG AGA CCA CTC CCT CGA GTT TGA CGA GTC CGA 

1. KA-41kD 580 590 600 610 620

[ 3996 ] >

2. P-Gau-4 580 590 600 610 620

[ 3696 ] t g g >

3. BO-41kD 580 590 600 610 620

[ 3684 ] t g g >

4. DK29-41 580 590 600 610 620

[ 3672 ] a ..a g g a. .>

5. PKO-41k 580 590 600 610 620

[ 3672 ] t g g >



630 640 650 660 670



B31-41kD GCA CCG GTT CAA GAG GGT GTT CAA CAG GAA GGA GCT CAA CAG CCA GCA 
CGT GGC CAA GTT CTC CCA CAA GTT GTC CTT CCT CGA GTT GTC GGT CGT 
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1. KA-41kD 630 640 650 660 670 

[ 3996 ] >

2. P-Gau-4 630 640 650 660 670

[ 3696 ] t c. ... g.a g ..a ... a..>

3. BO-41kD 630 640 650 660 670

[ 3684 ] t c. ... g.a g ..a ... a..>

4. DK29-41 630 640 650 660 670

[ 3672 ] t a ... .c a a >

5. PKO-41k 630 640 650 660 670

[ 3672 ] t c. ... g.a g ..a ... a..>

680 690 700 " 710 720 

* * * * * * * * m# 

B31-41kD CCT GCT ACA GCA CCT TCT CAA GGC GGA GTT AAT TCT CCT GTT AAT GTT 
GGA CGA TGT CGT GGA AGA GTT CCG CCT CAA TTA AGA GGA CAA TTA CAA 

1. KA-41kD 680 690 700 710 720

[ 3996 ] >

2. P-Gau-4 680 690 700 710 720

[ 3696 ] a t -.; >

3. BO-41kD 680 690 700 710 720
[ 3684 ] a t :

4. DK29-41 680 690 700 710 720
[ 3672 ] g g ..t :

5. PKO-41k 680 690 700 710 720
[ 3672 ] a t ... :

730 740 750 760 

* * * * « * « * « 



B31-41kD ACA ACT ACA GTT GAT GCT AAT ACA TCA CTT GCT AAA ATT GAA AAT GCT 
TGT TGA TGT CAA CTA CGA TTA TGT AGT GAA CGA TTT TAA CTT TTA CGA 



1. KA-41kD 730 740 750 760
[ 3996 ]

2. P-Gau-4 730 740 750 760
[ 3696 ] c a .

3. BO-41kD 730 740 750 760
[ 3684 ] c a .

4. DK29-41 730 740 750 760
[ 3672 ] c t a .

5. PKO-41k 730 740 750 760
[ 3672 ] c a .
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770 780 790 800 810 

« * * * * *"« * 

B31-41kD ATT AGA ATG ATA AGT GAT CAA AGG GCA AAT TTA GGT GCT TTC CAA AAT
TAA TCT TAC TAT TCA CTA GTT TCC CGT TTA AAT CCA CGA AAG GTT TTA 

1. KA-41k 770 780 790 800 810

[ 3996 ] >

2. P-Gau- 770 780 790 800 810

[ 3696 ] a >

3. BO-41k 770 780 790 800 810

[ 3684 ] a ... >

4. DK29-4 770 780 790 800 810

[ 3672 ] a >

5. PKO-41 770 780 790 800 810

[ 3672 ] a >

820 830 840 850 860 

* * * * * * * * « 

B31-41kD AGA CTT GAA TCT ATA AAG AAT AGT ACT GAG TAT GCA ATT GAA AAT CTA 
TCT GAA CTT AGA TAT TTC TTA TCA TGA CTC ATA CGT TAA CTT TTA GAT 

1. KA-41kD 820 830 840 850 860 

I 3996 ] = 

2. P-Gau-4 820 830 840 850 860 

I 3696 ] c t ; 

3. BO-41kD 820 830 840 850 860 

I 3684 ] c t : 

4. DK29-41 820 830 840 850 860 

I 3672 ] g 9 t c . . .: 

5. PKO-41k 820 830 840 850 860 

[ 3672 ] c t : 



870 880 890 900 910 

« * * * * * * * * * 

B31-41kD AAA GCA TCT TAT GCT CAA ATA AAA GAT GCT ACA ATG ACA GAT GAG GTT 
TTT CGT AGA ATA CGA GTT TAT TTT CTA CGA TGT TAC TGT CTA CTC CAA 

1- KA-41kD 870 880 890 900 910 
I 3996 ) 

2. P-Gau-4 870 * 880 890 900 910 

I 3696 ] 

3. BO-41kD 870 880 890 900 910 

[ 3684 ] 

4. DK29-41 870 880 890 900 910 

I 3672 ) 
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5. PK0-41k 870 880 890 900 910 
I 3672 ] > 

920 930 940 950 960 

B31-41kD GTA GCA GCA ACA ACT AAT ATG ATT TTA ACA CAA TCT GCA ATG GCA ATG 
CAT CGT CGT TGT TGA TTA TAC TAA AAT TGT GTT AGA CGT TAC CGT TAC 

1. KA-41kD 920 930 940 950 960 
[ 3996 ] St • • 

2. P-Gau-4 920 930 940' 950 960 

I 3696 ] t gt ._t > 

3. BO-41kD 920 930 940 950 ' 960* 
[ 3684 ] t gt t > 

4. DK29-41 920 930 940 # 950 960 
[ 3672 ] t gt ... .g .. . 

5. PK0-41k 920 930 940 950 960 

[ 3672 ] t a -gt t > 

970 980 990 1000 

* *« * * * * * * 

B31-41kD ATT GCG CAG GCT AAT CAA GTT CCC CAA TAT GTT TTG TCA TTG CTT AGA 
TAA CGC GTC CGA TTA GTT CAA GGG GTT ATA CAA AAC AGT AAC GAA TCT 

1. KA~41kD 970 980 990 1000 

I 3996 ] > 

2. P-Gau-4 970 980 990 1000 

[ 3696 ] ... ..a t > 

3. B0-41kD 970 980 990 1000 

I 3684 ) a t > 

4. DK29-41 970 980 990 1000 
i 3672 ] ..a t 

5. PK0-41k 970 980 990 1000 
[ 3672 ] a t 

1010 
* 

B31-41kD TAA 
ATT 

2. P-Gau 1010
[ 3696 ] ...>
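
The bracketed numbers in the alignment list are search scores, and the header gives a maximum possible score of 4044 for a 1011-base query. That is consistent with 4 points per identical base (4 x 1011 = 4044), although the source does not state the scoring matrix. The sketch below only illustrates that assumed scheme: `identity_score` and `max_score` are hypothetical helpers, and treating every mismatch as 0 points is an assumption, not the documented behaviour of the software that produced the figure.

```python
# Sketch of an identity-based score, assuming +4 per identical base and 0 per
# mismatch. The real "DNA database matrix" may score mismatches differently.

def max_score(length: int, match_score: int = 4) -> int:
    """Maximum score for a query of the given length under the assumed scheme."""
    return length * match_score


def identity_score(query: str, subject: str, match_score: int = 4) -> int:
    """Score two pre-aligned, equal-length sequences position by position."""
    if len(query) != len(subject):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        match_score
        for q, s in zip(query.upper(), subject.upper())
        if q == s
    )


if __name__ == "__main__":
    print(max_score(1011))                       # query length from the header
    print(identity_score("ATGATTATC", "ATGATCATC"))
```

Under the same assumption, a self-comparison of an 822-base sequence would score 4 x 822 = 3288.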
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Sequence Range: 1 to 822 

10 20 30 40 

. «« ««* * * * 

OspA-B31 ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCC TTA ATA GCA
TAC TTT TTT ATA AAT AAC CCT TAT CCA GAT TAT AAT CGG AAT TAT CGT

OspA-B31 10 20 30 40

[ 3288 ] >

OspA-KA 10 20 30 40

[ 3288 ] >

OspA-N40 10 20 30 40

[ 3276 ] >

OspA-ZS7 10 20 30 40

[ 3264 ] >

OspA-25015 10 20 30 40

[ 2802 ] t >

OspA-TRO 10 20 30 40

[ 2648 ] >

OspA-K48 10 20 30 40

[ 2584 ] >

OspA-HE11 10 20 30 40

[ 2580 ] >

OspA-DK29 10 20 30 40

[ 2566 ] >

OspA-Ip90 10 20 30 40

[ 2562 ] a >

OspA-BO 10 20 30 40

[ 2558 ] >

OspA-IP3 10 20 30 40

[ 2558 ] >

OspA-PKO 10 20 30 40

[ 2558 ] >

OspA-ACAI 10 20 30 40

[ 2556 ] >

OspA-P-GAU 10 20 30 40

[ 2544 ] >

50 60 70 80 90 

* «**««« ««« 

OspA-B31 TGT AAG CAA AAT GTT AGC AGC CTT GAC GAG AAA AAC AGC GTT TCA GTA 
ACA TTC GTT TTA CAA TCG TCG GAA CTG CTC TTT TTG TCG CAA AGT CAT 
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OspA-B31 50 60 70 80 90 

[ 3288 ] 

70 80 90 



OspA-ZS.7 50 60 70 80 90 ^ 

OspA-25015 50 60 70 ^ \ ^° *° > 



OspA-KA SO 6° 

£ 3288 ) 

OspA-N40 50 60 70 80 90 

I 3276 ] 

OspA-ZS7 
I 3264 ] 

OspA-250: 
( 2802 } 

OspA-TRO 50 60 70 80 .90 . 

[ 2648 ] 

OspA-K48 50 60 70 80 90 

[ 2584 ] t - a • • * 

OspA-HEll 50 60 70 80 90 

1 2580 J C •• 

OspA-DK29 50 60 70 80 90 

I 2566 ] C -' a C 

OspA-Ip90 50 • 60 70 80 90 

( 2562 ) t - a C > 

OspA-BO SO 60 70 80 90 

t 2558 ) ..c c - a c > 

OSPA-IP3 50 60 70 80 90 

I 2S58 ) ..c c -* a c > 

OspA-PXO SO 60 70 80 90 

I 2S58 ] -.c t - a c > 

OspA-ACAI SO 60 70 80 90 

I 2556 ] ..c c -- a c > 

ospA-P-GAU 50 60 70 ,.8° 9 J> 

I 2544 1 -.c c - a c 



100 



110 120 130 140 



OspA-B31 GAT TTG CCT GGT GAA ATG AAA GTT CTT GTA AGC AAA GAA AAA AAC AAA

CTA AAC GGA CCA CTT TAC TTT CAA GAA CAT TCG TTT CTT TTT TTG TTT

OSPA-B31 100 U0 120 130 140 

I 3288 ] . 

OSPA-KA 100 HO 120 130 140. 

I 3288 ] 



OspA-N40 100 



110 120 130 140 
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I 3276 ] c 

OspA-2S7 100 110 120 130 140 

I 3264 ] c 

OspA-25015 100 HO 120 130 140 

I 2802 ] • 

OspA-TOO 100 HO 120 130 140 

{ 2648 ] a 9- • 

OspA-K48 100 HO 120 , - 130 140 

( 258*-] a 9 c t g.. , 

OspA-HE 11 100 HO 120 -*30 140 

I 2580 ] a g t g.. . 

OspA-DK29 100 HO 120 130 140 

[ 2566 ) a 9 c t g.. . 

OspA-Ip90 100 HO 120 130* 140 

[ 2562 3 a g- ... c t g.. . 

OspA-BO 100 110 120 130 140 

{ 2558 3 9 c 9.- - 

OSPA-IP3 100 HO 120 130 % 140 

I 2558 3 9 . -t -.-t g. . . 

OspA-PKO 100 110 120 130 14 0 

I 2558 3 9 c 9-. • 

OspA-ACAI 100 110 120 ' 130 140 

[ 2556 3 -g c • 

ospA-P-GAU . 100 HO 120 130 14 0 

I 2544 3 g t .3 • • - 

150 160 .170 180 190 

« « «*•«*« « • 

OspA-B31 GAC GGC AAG TAC GAT CTA ATT GCA ACA GTA GAC AAG CTT GAG CTT AAA 
CTG CCG TTC ATG CTA GAT TAA CGT TGT CAT CTG TTC GAA CTC GAA TTT

OspA-B31 150 160 170 180 190 

I 3288 3 

OspA-KA 150 160 170 180 190 

[ 3288 3 

OspA-N40 150 160 170 180 190 

[ 3276 3 

OspA-7S7 150 160 170 180 ■ 190 

[ 3264 ] 

OspA-25015 150 160 170 180 190 

I 2802 3 ag g • 
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//3 y 

OsjA-TRO 150 . 160 "0 180 190 

I 2648 ] .-t ..t ..a ... ag g 

0*»-K48 150 160 170 180 X90 

( 2584 J t ..a ... ag. ... gag 

OspA-KE 11 150 1 

I 2580 ] . .t ..t ..a ... ag. 



OspA-KEll 150 160 "0 180 190 



• a • > 



OspA-bk29 • 150 
I 2566.] 



160 170 180 190 



OspA-Ip90 150 
I 2562 ] ..t ..t ..a ... ag 



t . .a ... ag. ... gag > 

160 170 ^ 180 190 



• > 



Ospfc-BO 

£ 2558 ] t ag. 

OSPA-IP3 150 



150 160 170 180 190 



• a . . .> 



160 170 180 190 



1 25S8 3 t ag ag a.. ... ..a .,.> 

OspA-PKO- " 150 160 170 1*80 190 

I 2558 3 t ag ag a a ...> 

Ospfc-ACAI 150 160 170 180 ' 190 

i 2556 3 t ag ag 



a a > 



160 170 180 190 



ospA-P-GAU 150 
1 2544 3 . v -.t ag ag a a ...> 

200 210 220 230 240 

♦ ■*••••• * * 

OspA-B31 GGA ACT TCT GAT AAA AAC AAT GGA TCT GGA GTA CTT GAA GGC GTA AAA
CCT TGA AGA CTA TTT TTG TTA CCT AGA CCT CAT GAA CTT CCG CAT TTT

200 210 220 230 240 



OspA-B31 

[ 3288 3 

OsiA-KA 200 210 220 -"230 240 

I 3288 3 > 

OSIA-N40 200 210 220 230 240 

I 3276 3 

OSPA-ZS7 200 210 220 230 240 

I 3264 ] 

OspA-25015 200 210 220 230 240 

t 2802 3 a g ..g 

OspA-TRO 200 210 220 230 240 

£ 2648 ] 9- • - c --t ac c ' a ' 



> 



> 



OspA-K48 
[ 2584 3 



200 210 220 230* 240 
c . .t ac t .a. . . - 



FIGURE 42 (4 of 16) 






OSP^-HEII 200 210 220 230 240 

£ 2580 ] c 3 * 

-inn 210 220 230 240 

OSJA-DK29 200 210 ar 

£ 2566 ] c ** C C ' 

^ onn 210 220 230 240 

OspA-Ip?0 200 2iu ^ t # 

I 2562 J c 

OSPA-BO 200 210 220 230 240 

I 2558 ] • ^ C ••• ' ,9 :* 9 C aC ' 

OSPA-IP3 200 210 220" 230 240 

I 2558 ] « t 9 C a<= * 

OS*-™ 200 210 220 230 240 

i 2S58 ] « c 9 " g "' — ac * — > 

OspA-ACAI 200 210 220 230 240 

I 2556 ] 9 C 9 . C aC * 

ospA-P-GAU 200 210 220 230 240 

I 2544 ] « c 9 ..t ac. ...> 



250 



260 270 280 



OspA-B31 GCT GAC AAA AGT AAA GTA AAA TTA ACA ATT TCT GAC GAT CTA GGT CAA

CGA CTG TTT TCA TTT CAT TTT AAT TGT TAA AGA CTG CTA GAT CCA GTT

OSPA-B31 250 260 270 280 

I 3288 ] 

OspA-KA 250 260 270 ' 280 

[ 3288 ] 

OSPA-N40 250 260 270 280 

I 3276 ] 

OSPA-2S7 2S0 260 270 280 

I 3264 ] 

OspA-25015 250 260 270 ^..^^ 

[ 2802 ] c y 

OspA-TRO 250 260 270 280 

I 2648 ] t c a a.. a..> 

OSPA-K48 250 260 • 270 260 

I 2584 ] a -.t -.c ... a 

OspA-HE 11 250 260 270 280 

I 2580 J a g.. -.9 a.. a..> 

OSPA-DK29 250 260 270 280 

I 2566 ) a c 9-- ■ • - c ••• a 



OspA-lp90 



2so 260 270 280 
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{ 2562 ] . a c 9-. ..g a.. a . . 

OspA-BO 250 260 270 280 

[ 2558 ] .a c 9 a .. 

OSPA-IP3 250 260 270 280 

1 2558 ) .a c g a.. a .. 

OspA-PKO 250 260 270 280 

[ 2558 J .a c g a _ 

OspA-ACAI 250 260 270 - 280 

I 2556 ] .a c g a., a*.. 

ospA-P-GAU 250 260 270. 280 

[ 2544 ] .a c g \ a ... a _ 

290 300 310 320 330 

• * * * ** * # * # m 

OspA-B31 ACC ACA CTT GAA GTT TTC AAA GAA GAT GGC AAA ACA CTA GTA TCA AAA 
TGG TGT GAA CTT CAA AAG TTT CTT CTA CCG TTT TGT GAT CAT AGT TTT 

Osp^-B31 290 300 310 320 330 

[ 3288 ] > 

OspA-KA 290 300 310 320 330 

1 3288 ] ...> 

OspA-N40 290 300 310 320 330 

[ 3276 ) .. . > 

OspA-2S7 290 300 310 320 330 

I 3264 J . .'.> 

OspA-25015 290 300 310 320 330 

I 2802 ] a t.. . .g > 

OspA-TRO 290 300 310 320 330 

I 2648 ) t a t > 

OspA-K48 290 300 310 320 330 

[ 2584 ] . .t. .a. t a c t > 

OspA-HE 11 290 300 310 320 330 

1 2580 ] t a.c t g ...> 

OspA-DK29 290 300 310 320 330 

- i 2566 ] . .t .a. t a - t > 

OspA-Ip90 290 300 310 320 330 

I 2562 ] t a.c t > 

OspA-BO 290 300 310 320 330 

I 2558 ] t.c ... c t.. ..g ... .g.> 

OSPA-IP3 290 300 310 320 330 

[ 2558 ] t.c ... c t.. ..g ... .g.> 
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OspA-PKO 290 300 310 320 330 

{ 2558 ] t.c ... c t.. ..g ... .g.> 

OspA-ACAI 290 300 310 320 330 

[ 2556 ] t.c ... c t.. 

ospA-P-GAU 290 300 310 320 330 

[ 2544 ] ... ... t.c ... c. ..a t.. . .g ... . g .> 

340 350 360 % -370 380 

* * * «• * * * . • 

OspA-B31 AAA GTA ACT TCC AAA GAC AAG TCA TCA ACA GAA GAA AAA TTC AAT GAA 
TTT CAT TGA AGG TTT CTG TTC AGT AGT TGT CTT CTT TTT AAG TTA CTT

OspA-B31 340 350 360 370 380 

[ 3288 ) > 

OspA-KA 340 350 360 370 . 380 

I 3288 ] > 

OspA-N40 340 350 360 370 380 

I 3276 ] > 

OspA-ZS7 340 350 360 370 380 

[ 3264 ] > 

OspA-25015 340 350 360 370 380 

I 2802 }- ... ag t t g > 

OspA-TRO 340 350 360 370 380 

I 2648 ] a. ..t t t c .c> 

OspA-X48 340 350 360 370 380 

[ 2584 ] c ctt c . . .> 

OspA-HE 11 340 350 360 370 380 

I 2580 ] c ctt c . . .> 

OspA-DK29 340 350 360 370 380 

[ 2566 } c ctt c .g.> 

OspA-Ip90 340 350 360 370 380 

[ 2562 ] c ctt c .c> 

OspA-BO 340 350 360 370 380 

[ 2558 ] g. ..t a a.. t ... .tg > 

OSPA-1P3 340 350 360 370 380 

I 2558 ] g. ..t a a t ... .tg > 

OspA-PKO - 340 350 360 370 380 
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TCG TAC TGT TCT CTT TTA CCT TGG TTT GAA CTT ATA TGT CTT TAC 

460 470 480 490 

****** *** 

AAA AGC GAT GGA ACC GGA AAA GCT AAA GAA GTT TTA AAA AAG TTT 
TTT TCG CTA CCT TGG CCT TTT CGA TTT CTT CAA AAT TTT TTC AAA 

500 510 520 530 540 

* *« * ** W . * * 

ACT CTT GAA GGA AAA GTA GCT AAT GAT AAA GTA ACA TTG GAA GTA 
TGA GAA CTT CCT TTT CAT CGA TTA CTA TTT CAT TGT AAC CTT CAT 

550 560 570 580 

« * * * * * * * * 

AAA GAA GGA ACC GTT ACT TTA AGT AAG GAA ATT GCA AAA TCT GGA 



Figure 44 (1 of 2) 






P-GAU/BO-OSpA 

Wednesday, April 27, 1994 11:22 AM

TTT CTT CCT TGG CAA TGA AAT TCA TTC CTT TAA CGT TTT AGA CCT

590 600 610 620 630

. * • * * * * 

GAA GTA ACA CTT GCT CTT AAT GAC ACT AAC ACT ACT CAG GCT ACT
CTT CAT TGT GAA CGA GAA TTA CTG TGA TTG TGA TGA GTC CGA TGA

'• 640 650 , 660 670 

.«••**•** 

AAA AAA ACT GGC GCA TGG GAT TCA AAA ACT TCT ACT TTA ACA ATT
TTT TTT TGA CCG CGT ACC CTA AGT TTT TGA AGA TGA AAT TGT TAA

680 690 700 710 720

AGT GTT AAC AGC AAA AAA ACT ACA CAA CTT GTC TTT ACT AAA CAA
TCA CAA TTG TCG TTT TTT TGA TGT GTT GAA CAG AAA TGA TTT GTT



730 740 750 760



GAC ACA ATA ACT GTA CAA AAA TAC GAC TCC GCA GGT ACC AAT TTA
CTG TGT TAT TGA CAT GTT TTT ATG CTG AGG CGT CCA TGG TTA AAT 

770 780 790 800 810

« * * * ' 

GAA GGC ACA GCA GTC GAA ATT AAA ACA CTT GAT GAA CTT AAA AAC 
CTT CCG TGT CGT CAG CTT TAA TTT TGT GAA CTA CTT GAA TTT TTG 

820 

* • 

GCT TTA AAA TAG 
CGA AAT TTT ATC 
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ATG AAA AAA TAT TTA TTG GGA ATA GCT CTA ATA TTA OCC TTA ATA 
TAC TTT TTT ATA AAT AAC CCT TAT OCA CAT TAT AAT COG AAT TAT 

50 60 70 80 <an 

GCA TGT AAG CAA AAT GTT AGC AGC CTT GAC GAG AAA AAC AGC GTT
CGT ACA TTC GTT TTA CAA TCG TCG GAA CTG CTC TTT TTG TCG CAA

100 110 120 130

* • * * * «♦ . . t 

TCA GTA GAT TTG CCT GGT GAA ATG AAA GTT CTT GTA AGC AAA GAA
AGT CAT CTA AAC GGA CCA CTT TAC TTT CAA GAA CAT TCG TTT CTT 



"0 150 160 170 



180 



AAA AAC AAA GAC GGC AAG TAC GAT CTA ATT GCA ACA GTA GAC AAG
TTT TTG TTT CTG CCG TTC ATG CTA GAT TAA CGT TGT CAT CTG TTC

190 200 210 220

CTT GAG CTT AAA GGA ACT TCT GAT AAA AAC AAT GGA TCT GGA GTA
GAA CTC GAA TTT CCT TGA AGA CTA TTT TTG TTA CCT AGA CCT CAT



230 240 250 260 270

CTT GAA GGC GTA AAA GCT GAC AAA AGT AAA GTA AAA TTA ACA ATT
GAA CTT CCG CAT TTT CGA CTG TTT TCA TTT CAT TTT AAT TGT TAA

280 290 300 310

TCT GAC GAT CTA GGT CAA ACC ACA CTT GAA GTT TTC AAA GAA GAT
AGA CTG CTA GAT CCA GTT TGG TGT GAA CTT CAA AAG TTT CTT CTA



320 330 340 350 360

GGC AAA ACA CTA GTA TCA AAA AAA GTA ACT TCC AAA GAC AAG TCA
CCG TTT TGT GAT CAT AGT TTT TTT CAT TGA AGG TTT CTG TTC AGT

370 380 390 400

TCA ACA GAA GAA AAA TTC AAT GAA AAA GGT GAA GTA TCT GAA AAA
AGT TGT CTT CTT TTT AAG TTA CTT TTT CCA CTT CAT AGA CTT TTT

410 420 430 440 450

ATA ATA ACA AGA GCA AAT GGA ACC AAA CTT GAA TAT ACA GAA ATG
TAT TAT TGT TCT CGT TTA CCT TGG TTT GAA CTT ATA TGT CTT TAC

460 470 480 490

AAA AGC GAT GGA ACC GGA AAA GCT AAA GAA GTT TTA AAA AAG TTT
TTT TCG CTA CCT TGG CCT TTT CGA TTT CTT CAA AAT TTT TTC AAA



500 510 520 530 540

ACT CTT GAA GGA AAA GTA GCT AAT GAT AAA GTA ACA TTG GAA GTA
TGA GAA CTT CCT TTT CAT CGA TTA CTA TTT CAT TGT AAC CTT CAT

550 560 570 580 

AAA GAA GGA ACC GTT ACT TTA AGT AAG GAA ATT TCA AAA TCT GGG 
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TTT CTT CCT TGG CAA TGA AAT TCA TTC CTT TAA AGT TTT AGA CCC 

590 600 610 620 630

* ***** * * * 

GAA GTT TCA GTT GAA CTT AAT GAC ACT GAC AGT AGT GCT GCT ACT
CTT CAA AGT CAA CTT GAA TTA CTG TGA CTG TCA TCA CGA CGA TGA

640 650 660 670 

* * « * * * * * • 

AAA AAA ACT GCA GCT TGG AAT TCA AAA ACT TCC ACT TTA ACA ATT
TTT TTT TGA CGT CGA ACC TTA AGT TTT TGA AGG TGA AAT TGT TAA

680 690 700 710 720 

* * ** 

AGT GTG AAT AGC CAA AAA ACC AAA AAC CTT GTA TTC ACA AAA GAA 
TCA CAC TTA TCG GTT TTT TGG TTT TTG GAA CAT AAG TGT TTT CTT

730 740 750 760 

* * « * * * 

GAC ACA ATA ACA GTA CAA AAA TAC GAC TCA GCA GGC ACC AAT CTA 
CTG TGT TAT TGT CAT GTT TTT ATG CTG AGT CGT CCG TGG TTA GAT 

770 780 790 800 810 

* * * « * * * * 

GAA GGC AAA GCA GTC GAA ATT ACA ACA CTT AAA GAA CTT AAA AAC 
CTT CCG TTT CGT CAG CTT TAA TGT TGT GAA TTT CTT GAA TTT TTG 

820 

* * 

GCT TTA AAA TAA 
CGA AAT TTT ATT 
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. «■ * * * *« « * * 

ATG AAA AAA TAT TTA TTG GGA ATA GGT CTA ATA TTA GCC TTA ATA 
TAC TTT TTT ATA AAT AAC CCT TAT CCA GAT TAT AAT CGG AAT TAT 

50 60 70 80 90 

» .*« 4 ft ft * * * 

GCA TGC AAG CAA AAT GTT AGC AGC CTT GAT GAA AAA AAC AGC GCT
CGT ACG TTC GTT TTA CAA TCG TCG GAA CTA CTT TTT TTG TCG CGA

100 110 120 130 
«•**** * » * * 

TCA GTA GAT TTG CCT GGT GAG ATG AAA GTT CTT GTA AGT AAA GAA
AGT CAT CTA AAC GGA CCA CTC TAC TTT CAA GAA CAT TCA TTT CTT 

140 150 160 170 180 

* * .* * * * * * * 

AAA GAC AAA GAC GGT AAG TAC AGT CTA AAG GCA ACA GTA GAC AAG . 
TTT CTG TTT CTG CCA TTC ATG TCA GAT TTC CGT TGT CAT CTG TTC 

190 200 210 220 

ATT GAG CTA AAA GGA ACT TCT GAT AAA GAC AAT GGT TCT GGA GTG 
TAA CTC GAT TTT CCT TGA AGA CTA TTT CTG TTA CCA AGA CCT CAC 

230 240 250 260 270 

% * * * ** * «-« 

CTT GAA GGT ACA AAA GAT GAC AAA ACT AAA GCA AAA TTA ACA ATT 
GAA CTT CCA TGT TTT CTA CTG TTT TCA TTT CGT TTT AAT TGT TAA 

280 290 300 310 

■ * V 9 * •« * ft ft 

GCT GAC GAT CTA AGT AAA ACC ACA TTC GAA CTT TTA AAA GAA GAT 
CGA CTG CTA GAT TCA TTT TGG TGT AAG CTT GAA AAT TTT CTT CTA

320 330 340 350 360 

* ft-* *** * mm 

GGC AAA ACA TTA GTG TCA AGA AAA GTA AGT TCT AGA GAC AAA ACA 
CCG TTT TGT AAT CAC AGT TCT TTT CAT TCA AGA TCT CTG TTT TGT 

370 380 390 400 

TCA ACA GAT GAA ATG TTC AAT GAA AAA GGT GAA TTG TCT GCA AAA 
AGT TGT CTA CTT TAC AAG TTA CTT TTT CCA CTT AAC AGA CGT TTT 

410 420 430 440 450 

* * * - w ■* ft. * * 

ACC ATG ACA AGA GAA AAT GGA ACC AAA CTT GAA TAT ACA GAA ATG 
TGG TAC TGT TCT CTT TTA CCT TGG TTT GAA CTT ATA TGT CTT TAC

460 470 480 490 

* * * * «* 

AAA AGC GAT GGA ACC GGA AAA GCT AAA GAA GTT TTA AAA AAG TTT 
TTT TCG CTA CCT TGG CCT TTT CGA TTT CTT CAA AAT TTT TTC AAA 

500 510 520 530 540 

* % * * t * * A* 

ACT CTT GAA GGA AAA GTA GCT AAT GAT AAA GTA ACA TTG GAA GTA 
TGA GAA CTT CCT TTT CAT CGA TTA CTA TTT CAT TGT AAC CTT CAT 

550 560 570 580

« * * # ** 

AAA GAA GGA ACC GTT ACT TTA AGT AAG GAA ATT TCA AAA TCT GGG 
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TTT CTT CCT TGG CAA TGA AAT TCA TTC CTT TAA ACT TIT AGA CCC 

590 600 610 620 630

GAA GTT TCA GTT GAA CTT AAT GAC ACT GAC AGT AGT GCT GCT ACT
CTT CAA AGT CAA CTT GAA TTA CTG TGA CTG TCA TCA CGA CGA TGA

640 650 660 . 670 

* • • * ♦ * * • * * 

AAA AAA ACT GCA GCT TGG AAT TCA AAA ACT TCC ACT TTA ACA ATT 
TTT TTT TGA CGT CGA ACC TTA AGT TTT TGA AGG TGA AAT TGT TAA

680 690 700 710 720 

« »»* * « * * ' 

AGT GTG AAT AGC CAA AAA ACC AAA AAC CTT GTA TTC ACA AAA GAA 
TCA CAC TTA TCG GTT TTT TGG TTT TTG GAA CAT AAG TGT TTT CTT

730 740 750 760 

♦ * * * 

GAC ACA ATA ACA GTA CAA AAA TAC GAC TCA GCA GGC ACC AAT CTA
CTG TGT TAT TGT CAT GTT TTT ATG CTG AGT CGT CCG TGG TTA GAT 

770 780 790 800 810 

• • • • * * * « ' 

GAA GGC AAA GCA GTC GAA ATT ACA ACA CTT AAA GAA CTT AAA AAC 
CTT CCG TTT CGT CAG CTT TAA TGT TGT GAA TTT CTT GAA TTT TTG 

820 

* •* 

GCT TTA AAA TAA
CGA AAT TTT ATT 
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